Home Spring Boot Spring JdbcTemplate - Batch Insert/Update with maxPerformance

Spring JdbcTemplate – Batch Insert/Update with maxPerformance

This guide provides examples on Batch Insert/Update using Spring JdbcTemplate and explains how to run batch Inserts asynchronously/concurrently to optimize performance to MySql to work on large data with million records with maxPerformance.

1. JdbcTemplate Batch Inserts Example

Insert batch example using JdbcTemplate batchUpdate() operation. Batch Insert/Update operations must be Transactional. For batch operations you can use batch update callback BatchPreparedStatementSetter  to set parameter values.

	@Transactional(propagation = Propagation.REQUIRES_NEW)
	public void batchInsert(List<User> users) {
		StopWatch timer = new StopWatch();

		String sql = "INSERT INTO `USER` (USERNAME, PASSWORD, CREATEDTIME, UPDATEDTIME, USERTYPE, DATEOFBIRTH)"
				+ " VALUES(?,?,?,?,?,?)";

		timer.start(); 
		jdbcTemplate.batchUpdate(sql, new BatchPreparedStatementSetter() {

			@Override
			public void setValues(PreparedStatement ps, int i) throws SQLException {
				User user = users.get(i);
				ps.setString(1, user.getUserName());
				ps.setString(2, user.getPassword());
				ps.setDate(3, new java.sql.Date(user.getCreatedTime().getTime()));
				ps.setDate(4,
						user.getUpdatedTime() == null ? null : new java.sql.Date(user.getUpdatedTime().getTime()));
				ps.setString(5, user.getUserType().toString());
				ps.setDate(6, new java.sql.Date(user.getDateofBirth().getTime()));
			}

			@Override
			public int getBatchSize() {
				return users.size();
			}
		});

		timer.stop();
		log.info("batchInsert -> Total time in seconds: " + timer.getTotalTimeSeconds());
	}

To re-use BatchPreparedStatementSetter callback implementation in other operations, that part of code moving to a separate class.

public class UserBatchPreparedStatementSetter implements BatchPreparedStatementSetter{

	private List<User> users;
	
	public UserBatchPreparedStatementSetter(List<User> users) {
		super();
		this.users = users;
	}

	@Override
	public void setValues(PreparedStatement ps, int i) {
		
		try {
			User user = users.get(i);
			ps.setString(1, user.getUserName());
			ps.setString(2, user.getPassword());
			ps.setDate(3, new java.sql.Date(user.getCreatedTime().getTime()));
			ps.setDate(4, user.getUpdatedTime() == null ? null : new java.sql.Date(user.getUpdatedTime().getTime()));
			ps.setString(5, user.getUserType().toString());
			ps.setDate(6, new java.sql.Date(user.getDateofBirth().getTime()));
		} catch (SQLException e) {
			e.printStackTrace();
		}
	}

	@Override
	public int getBatchSize() {
		return users.size();
	}
}

After code changes, final batchInsert() operation :

	@Transactional(propagation = Propagation.REQUIRES_NEW)
	public void batchInsert(List<User> users) {
		StopWatch timer = new StopWatch();

		String sql = "INSERT INTO `USER` (USERNAME, PASSWORD, CREATEDTIME, UPDATEDTIME, USERTYPE, DATEOFBIRTH)"
				+ " VALUES(?,?,?,?,?,?)";

		timer.start(); 
		jdbcTemplate.batchUpdate(sql, new UserBatchPreparedStatementSetter(users));

		timer.stop();
		log.info("batchInsert -> Total time in seconds: " + timer.getTotalTimeSeconds());
	}

2. NamedParameterJdbcTemplate Batch Update Example

We can use same JdbcTemplate batchUpdate() operation for SQL bulk update queries, for variation let’s use NamedParameterJdbcTemplate.

	@Transactional(propagation = Propagation.REQUIRES_NEW)
	public void batchUpdate(List<User> users) {
		StopWatch timer = new StopWatch();
		String sql = "UPDATE `USER` SET USERNAME=:name, UPDATEDTIME=:updatedAt WHERE ID=:id";
	
		List<MapSqlParameterSource> params = new ArrayList<MapSqlParameterSource>();

		for (User u : users) {
			MapSqlParameterSource source = new MapSqlParameterSource();
			source.addValue("name", u.getUserName().toUpperCase());
			source.addValue("updatedAt", new Date());
			source.addValue("id", u.getId());

			params.add(source);
		}

		timer.start(); 
		namedJdbcTemplate.batchUpdate(sql, params.toArray(MapSqlParameterSource[]::new));

		timer.stop();
		log.info("batchUpdate -> Total time in seconds: " + timer.getTotalTimeSeconds());
	}

We have seen couple of examples on how to do batch operations using JdbcTemplate, let’s see performance and optimizations.

3. Optimizing Performance

3.1. Let’s run above examples against Mysql database and compare execution times on 1000000 sample records. Following are the results on 1.2 Million sample records.

batchInsert -> Total time in seconds: 774.8934786    - (13 Minutes Apprx)
batchUpdate -> Total time in seconds: 955.1260053   - (16 Minutes Apprx)

3.2. To improve performance of query execution for mysql, we can apply properties maxPerformance and rewriteBatchedStatements=true to the data source configuration.

(Note : Other database bases may not need these properties, their default batch query execution strategy might be different. You have to make sure about that before applying these properties for other databases. And have a look into database driver properties whether there is any optimization properties available to improve performance for other databases )

### Optimization
spring.datasource.hikari.data-source-properties.useConfigs=maxPerformance
spring.datasource.hikari.data-source-properties.rewriteBatchedStatements=true

3.3. Let’s run again and notice performance.

batchInsert -> Total time in seconds: 143.1325339     -  (2.5 Minutes Apprx)
batchUpdate -> Total time in seconds: 915.4360036   - (15 Minutes Apprx)

You can notice, Insert query has great improvement on execution time, but update query doesn’t have any improvement. When you set rewriteBatchedStatements=true then Jdbc groups several sql insert query statements into single insert query and executes it instead of running each sql query. So that number of round trips to the DB will be reduced.

Example : If rewriteBatchedStatements=flase : (Multiple Sql Statements)

INSERT INTO `USER` (ID, USER_NAME, PASSWORD, EMAIL, CREATED_TIME, UPDATED_TIME, USER_TYPE, DOB) VALUES (1,'PeterM','ABC123abc*','[email protected]','2020-03-17 07:13:30',NULL,'STUDENT','2020-03-17');
INSERT INTO `USER` (ID, USER_NAME, PASSWORD, EMAIL, CREATED_TIME, UPDATED_TIME, USER_TYPE, DOB) VALUES (2,'Mike','password','[email protected]','2020-03-18 14:59:35',NULL,'EMPLOYEE','2020-03-18');
INSERT INTO `USER` (ID, USER_NAME, PASSWORD, EMAIL, CREATED_TIME, UPDATED_TIME, USER_TYPE, DOB) VALUES (3,'KingPeter','password','[email protected]','2020-03-19 12:19:15',NULL,'EMPLOYEE','2020-03-18');

Example : If rewriteBatchedStatements=true : (Groups into single Sql Statement, and several groups)

INSERT INTO `USER` VALUES 
     (1,'PeterM','ABC123abc*','[email protected]','2020-03-17 07:13:30',NULL,'STUDENT','2020-03-17'),
     (2,'Mike','password','[email protected]','2020-03-18 14:59:35',NULL,'EMPLOYEE','2020-03-18'),
     (3,'KingPeter','password','[email protected]','2020-03-19 12:19:15',NULL,'EMPLOYEE','2020-03-18');

3.4. rewriteBatchedStatements=true works well for insert queries but not for update queries. To verify queries, enable fullDebug to the datasource configuration like spring.datasource.hikari.data-source-properties.useConfigs=maxPerformance,fullDebug . To improve update queries execution time, building static update queries like following example works much better.

UPDATE `USER` 	 	 
SET USERNAME = CASE	 	 
 WHEN ID=1 THEN 'PETERM'	 	 
 WHEN ID=2 THEN 'MARTIN'
 WHEN ID=3 THEN 'ASHJA'
END,	 	 
PASSWORD = CASE	 	 
  WHEN ID=1 THEN 'ABC123abc*'	 	 
  WHEN ID=2 THEN 'password'
  WHEN ID=3 THEN 'Password123'
END	 	 
WHERE ID IN ( 1,2,3);

4. Optimizing with Asynchronous Batch Updates

Still we can improve performance using multi threading concepts for insert queries. Let’s have look into asynchronous batch execution using CompletableFuture. Following image illustrates a simple strategy on how we can perform batch inserts asynchronously.

jdbctemplate batch update

4.1. Asynchronous batch Insert example

	@Autowired
	private JdbcTemplate jdbcTemplate;

	@Autowired
	private NamedParameterJdbcTemplate namedJdbcTemplate;

	@Value("${jdbc.batch_insert_size}")
	private int batchSize;

	private static final ExecutorService executor = Executors.newFixedThreadPool(10);

	@Transactional(propagation = Propagation.REQUIRES_NEW)
	public void batchInsertAsync(List<User> users) throws InterruptedException, ExecutionException {
		StopWatch timer = new StopWatch();

		String sql = "INSERT INTO `USER` (USERNAME, PASSWORD, CREATEDTIME, UPDATEDTIME, USERTYPE, DATEOFBIRTH)"
				+ " VALUES(?,?,?,?,?,?)";

		final AtomicInteger sublists = new AtomicInteger();

		CompletableFuture[] futures = users.stream()
				.collect(Collectors.groupingBy(t -> sublists.getAndIncrement() / batchSize))
				.values()
				.stream()
				.map(ul -> runBatchInsert(ul, sql))
				.toArray(CompletableFuture[]::new);

		CompletableFuture<Void> run = CompletableFuture.allOf(futures);

		timer.start();
		run.get();
		timer.stop();

		log.info("batchInsertAsync -> Total time in seconds: " + timer.getTotalTimeSeconds());
	}

	public CompletableFuture<Void> runBatchInsert(List<User> users, String sql) {
		return CompletableFuture.runAsync(() -> {
			jdbcTemplate.batchUpdate(sql, new UserBatchPreparedStatementSetter(users));
		}, executor);
	}

4.2. Run and verify execution time

To improve execution time adjust values for spring.datasource.hikari.maximumPoolSize, jdbc.batch_insert_size, ExecutorService thread count to get best results.

batchInsertAsync -> Total time in seconds: 81.0693804   - (1.4 Mins Apprx)

5. Improving Batch Update Queries

I haven’t seen much improvement when I ran batch update queries asynchronously. Like stated in 3.4 section, static queries perform much better for batch update statements. Let’s have a look into an example.

5.1. Creating a Static Query for update Statements

public class BatchQueryBuilder {

	public static String buildUpdateQuery(List<User> users) {
		
		StringBuffer query = new StringBuffer("UPDATE `USER` SET USERNAME = CASE ");
		StringBuffer setUserName = new StringBuffer();
		StringBuffer setUpdatedTime = new StringBuffer();
		
		users.forEach(u -> {
			setUserName.append(String.format("WHEN ID=%d THEN '%s' ", u.getId(), u.getUserName().toLowerCase()));
			setUpdatedTime.append(String.format("WHEN ID=%d THEN NOW() ", u.getId()));
			});
		
		query.append(setUserName)
		.append("END, PASSWORD = CASE ")
		.append(setUpdatedTime)
		.append(String.format("END WHERE ID IN (%s)", 
				String.join(",", users.stream().map(u -> u.getId()+"").collect(Collectors.toList()))));
		
		return query.toString();
	}
}
	@Transactional(propagation = Propagation.REQUIRES_NEW)
	public void batchUpdateStatic(List<User> users) throws InterruptedException, ExecutionException {
		StopWatch timer = new StopWatch();

		final AtomicInteger sublists = new AtomicInteger();

		String[] queries = users.stream()
				.collect(Collectors.groupingBy(t -> sublists.getAndIncrement() / batchUpdateSize))
				.values()
				.stream()
				.map(ul -> BatchQueryBuilder.buildUpdateQuery(ul))
				.toArray(String[]::new);

		timer.start();
		//System.out.println(Arrays.asList(queries));
		jdbcTemplate.batchUpdate(queries);
		timer.stop();

		log.info("batchUpdateStatic -> Total time in seconds: " + timer.getTotalTimeSeconds());
	}

5.2. Run and verify execution time

Following is the execution results on 1.2 Million records. Still not up to the insert statement results, but shown much better results than earlier results.

batchInsertAsync -> Total time in seconds: 261.53663903   - (4.5 Mins Apprx)

Note: Keep batchUpdateSize <= 1000, it may slow downs the query and some data bases may not support more than 1000 values in IN clause. Refer Spring JdbcTemplate – IN Clause

6. Conclusion

In this tutorial we have covered several examples on Batch Insert/Update using Spring JdbcTemplate and explained how to run Batch Inserts asynchronously/concurrently to optimize performance to MySql to work on large data with million records.

Checkout source code at GitHub.

Other Spring JDBC Examples :

  1. Spring JdbcTemplate – Stored Procedure
  2. Spring JdbcTemplate – Query
  3. Spring JdbcTemplate – Pagination
  4. Spring JdbcTemplate – IN Clause
  5. Spring JdbcTemplate – Auto-generated Keys
  6. Spring Boot – Loading Initial Data

7. References

  1. MySql Configuration Properties
  2. MySql Multiple updates
  3. Spring Document
  4. Spring Boot Load Initial Data
  5. Spring Boot JUnit 5 Test

Satish Varma
Satish Varmahttps://javabydeveloper.com
Satish is post graduated in master of computer applications and experienced software engineer with focus on Spring, JPA, REST, TDD and web development. Follow him on LinkedIn or Twitter or Facebook

LEAVE A REPLY

Please enter your comment!
Please enter your name here

Stay in Touch

Categories