在 Spring 项目中操作 ClickHouse 数据库,可通过 JDBC 或集成第三方库(如 ClickHouse Native JDBC、Spring Data JDBC 扩展)实现。以下是详细步骤和示例:

1. 添加依赖

<!-- ClickHouse JDBC 驱动 -->
<dependency>
    <groupId>com.clickhouse</groupId>
    <artifactId>clickhouse-jdbc</artifactId>
    <version>0.4.6</version>
</dependency>

<!-- Spring Boot Starter Data JDBC -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-jdbc</artifactId>
</dependency>

<!-- Lombok (可选) -->
<dependency>
    <groupId>org.projectlombok</groupId>
    <artifactId>lombok</artifactId>
    <optional>true</optional>
</dependency>

2. 配置数据源

application.properties 中添加 ClickHouse 连接配置:

spring.datasource.url=jdbc:clickhouse://localhost:8123/default
spring.datasource.username=default
spring.datasource.password=
spring.datasource.driver-class-name=com.clickhouse.jdbc.ClickHouseDriver

3. 定义实体类

import lombok.Data;

@Data
public class User {
    private Long id;
    private String name;
    private Integer age;
    private java.time.LocalDateTime createTime;
}

4. 方式一:使用 JdbcTemplate

import org.springframework.jdbc.core.BeanPropertyRowMapper;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Repository;

import java.time.LocalDateTime;
import java.util.List;

@Repository
public class UserRepository {
    private final JdbcTemplate jdbcTemplate;

    public UserRepository(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    // 插入数据
    public void save(User user) {
        String sql = "INSERT INTO users (id, name, age, create_time) VALUES (?, ?, ?, ?)";
        jdbcTemplate.update(sql, user.getId(), user.getName(), user.getAge(), LocalDateTime.now());
    }

    // 查询所有
    public List<User> findAll() {
        String sql = "SELECT id, name, age, create_time FROM users";
        return jdbcTemplate.query(sql, new BeanPropertyRowMapper<>(User.class));
    }

    // 批量插入(优化性能)
    public void batchSave(List<User> users) {
        String sql = "INSERT INTO users (id, name, age, create_time) VALUES (?, ?, ?, ?)";
        jdbcTemplate.batchUpdate(sql, users, users.size(), (ps, user) -> {
            ps.setLong(1, user.getId());
            ps.setString(2, user.getName());
            ps.setInt(3, user.getAge());
            ps.setObject(4, user.getCreateTime());
        });
    }
}

5. 方式二:使用 Spring Data JDBC(扩展)

步骤1:创建 Repository 接口

import org.springframework.data.repository.CrudRepository;

public interface UserSpringDataRepository extends CrudRepository<User, Long> {
    // Spring Data 自动生成基本 CRUD 方法
}

步骤2:配置 ClickHouse 方言

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.jdbc.core.convert.JdbcCustomConversions;
import org.springframework.data.jdbc.repository.config.AbstractJdbcConfiguration;
import org.springframework.data.relational.core.dialect.Dialect;
import org.springframework.jdbc.core.namedparam.NamedParameterJdbcOperations;

import java.util.Collections;

@Configuration
public class ClickHouseJdbcConfig extends AbstractJdbcConfiguration {

    @Override
    @Bean
    public Dialect jdbcDialect(NamedParameterJdbcOperations operations) {
        // 使用自定义方言或默认方言
        return new org.springframework.data.relational.core.dialect.PostgresDialect();
        // 注意:ClickHouse 与 PostgreSQL 语法有差异,可能需要自定义方言
    }

    @Override
    @Bean
    public JdbcCustomConversions jdbcCustomConversions() {
        return new JdbcCustomConversions(Collections.emptyList());
    }
}

6. 优化 ClickHouse 性能

批量插入

public void batchInsert(List<User> users) {
    String sql = "INSERT INTO users (id, name, age) VALUES ";
    StringBuilder sb = new StringBuilder(sql);
    
    for (int i = 0; i < users.size(); i++) {
        sb.append("(?, ?, ?)");
        if (i < users.size() - 1) {
            sb.append(",");
        }
    }
    
    jdbcTemplate.update(sb.toString(), users.stream()
            .flatMap(user -> Stream.of(user.getId(), user.getName(), user.getAge()))
            .toArray());
}

异步操作

import org.springframework.scheduling.annotation.Async;
import org.springframework.stereotype.Service;
import reactor.core.publisher.Mono;
import reactor.core.scheduler.Schedulers;

@Service
public class AsyncUserService {
    private final UserRepository repository;

    public AsyncUserService(UserRepository repository) {
        this.repository = repository;
    }

    @Async("asyncExecutor")
    public Mono<Void> saveAsync(User user) {
        return Mono.fromRunnable(() -> repository.save(user))
                .subscribeOn(Schedulers.boundedElastic())
                .then();
    }
}

7. 注意事项

  1. 数据类型映射

    • ClickHouse 的 DateTime 对应 Java 的 java.time.LocalDateTime
    • ClickHouse 的 Array 类型需特殊处理(可使用 Jackson 序列化)。
  2. 批量操作

    • ClickHouse 适合批量写入,避免单条插入性能问题。
    • 使用 INSERT ... VALUES (...) 而非多条 INSERT 语句。
  3. 表引擎选择

    • 常用引擎:MergeTree(支持索引)、Memory(临时数据)。
    • 示例建表语句:
      CREATE TABLE users (
          id UInt64,
          name String,
          age UInt8,
          create_time DateTime
      ) ENGINE = MergeTree()
      ORDER BY id;
      
  4. 连接池配置

    # 增加连接池配置(HikariCP)
    spring.datasource.hikari.maximum-pool-size=20
    spring.datasource.hikari.minimum-idle=5
    spring.datasource.hikari.idle-timeout=30000
    

8. 测试代码

import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;

import java.util.Arrays;
import java.util.List;

import static org.junit.jupiter.api.Assertions.assertFalse;

@SpringBootTest
public class UserRepositoryTest {
    @Autowired
    private UserRepository repository;

    @Test
    public void testBatchInsert() {
        List<User> users = Arrays.asList(
                new User(1L, "Alice", 25),
                new User(2L, "Bob", 30)
        );
        
        repository.batchSave(users);
        List<User> result = repository.findAll();
        
        assertFalse(result.isEmpty());
    }
}

总结

通过 Spring 的 JdbcTemplate 或 Spring Data JDBC 可便捷操作 ClickHouse,需注意 ClickHouse 的特殊语法和性能优化策略(如批量插入、合理选择表引擎)。对于复杂场景,可结合异步编程提升吞吐量。