背景及目標(biāo):現(xiàn)有數(shù)據(jù)1000w單表,為壓力測(cè)試準(zhǔn)備1億條數(shù)據(jù)。
步驟:
1.將1000w條記錄,除id外都導(dǎo)入到多個(gè)文件中:
//DELIMITER
DROP PROCEDURE if EXISTS createManyTable;
create PROCEDURE createManyTable()
BEGIN
DECLARE i int;
DECLARE fileName VARCHAR(30);
set i=1;
while i<251
DO
SET fileName=CONCAT('f_log_',i,'.txt');
SET @STMT :=CONCAT("select `xx`,`xx`,`xx`,`xx`,.... into outfile 'temp/",fileName,
"' lines terminated by 'rn' from `f_log` WHERE id>= ",40000*(i-1)," AND id< ",40000*i);
PREPARE STMT FROM @STMT;
EXECUTE STMT;
set i=i+1;
end while;
END;
//DELIMITER
CALL createManyTable();
2. 將上述多個(gè)文件擴(kuò)充10倍并合并到同一個(gè)文件,并且在第一列加入id列:
public static void main(String[] args) throws IOException {
int i=10000000;
int step=40000;
File out=new File("E:/data/f_log_data.txt");
for(int k=1;k<251;k++){
File file=new File("E:/data/temp/f_log_"+k+".txt");
StringBuffer sb=new StringBuffer();
if(file.exists()){
sb=readFile(file,i+step*k);
writeFile11(out,sb);
}
}
}
public static StringBuffer readFile(File file,int start) throws IOException{
StringBuffer sb=new StringBuffer();
BufferedReader reader=new BufferedReader(new FileReader(file));
String line="";
while(line != null){
line = reader.readLine();
if(line == null){
break;
}
if(line.trim().equalsIgnoreCase("")){
continue;
}
start++;
sb.Append(start+"t"+line.trim()+"rn");
}
reader.close();
return sb;
}
public static void writeFile(File file,StringBuffer sb) throws IOException{
BufferedWriter writer = new BufferedWriter(new FileWriter(file, true));
writer.write(sb.toString());
writer.close();
}
public void writeFile11() throws IOException{
// TODO Auto-generated method stub
BufferedWriter writer = new BufferedWriter(new FileWriter(new File("D:/driver/data.txt"), true));
for(int i=0;i<1000000;i++){
if(i%10==0){
writer.write("趙"+(i/10)+"t"+ (int)(Math.random()*100)+"n");
}if(i%10==1){
writer.write("錢(qián)"+(i/10)+"t"+ (int)(Math.random()*100)+"n");
}
if(i%10==2){
writer.write("孫"+(i/10)+"t"+ (int)(Math.random()*100)+"n");
}if(i%10==3){
writer.write("李"+(i/10)+"t"+ (int)(Math.random()*100)+"n");
}
if(i%10==4){
writer.write("鄭"+(i/10)+"t"+ (int)(Math.random()*100)+"n");
}if(i%10==5){
writer.write("吳"+(i/10)+"t"+ (int)(Math.random()*100)+"n");
}
if(i%10==6){
writer.write("周"+(i/10)+"t"+ (int)(Math.random()*100)+"n");
}if(i%10==7){
writer.write("王"+(i/10)+"t"+ (int)(Math.random()*100)+"n");
}
if(i%10==8){
writer.write("張"+(i/10)+"t"+ (int)(Math.random()*100)+"n");
}if(i%10==9){
writer.write("劉"+(i/10)+"t"+ (int)(Math.random()*100)+"n");
}
}
writer.close();
}
3. 將合并后的文件導(dǎo)入到數(shù)據(jù)表中:
load data local infile '/tmp/finance_log_data.txt' into table f_log(`id`,`xx`, `xx`,......................... );
注意事項(xiàng):開(kāi)始考慮使用存儲(chǔ)過(guò)程來(lái)逐步導(dǎo)入到數(shù)據(jù)表中,但load data命令不能在存儲(chǔ)過(guò)程中使用。
另外,數(shù)據(jù)的合并也可以以shell腳本完成,但習(xí)慣使用JAVA了,因此以java來(lái)完成,顯得比較復(fù)雜。不過(guò),可以隨便復(fù)習(xí)一下java的讀寫(xiě)文件,有算不錯(cuò)的經(jīng)歷。
Q&A
時(shí)間問(wèn)題:生成1億條數(shù)據(jù)(在有索引的情況下),用時(shí)3個(gè)小時(shí)。如果使用insert語(yǔ)句,估計(jì)會(huì)瘋掉!






