How to do bulk inserts in a DB table-rails
the most common case you will be doing while bulk inserts in DB via a CSV file data read or some kind of API feed data storage.
in a novice approach, we will just run through the all csv/api data and build the database table data and insert it.
CSV.foreach("path/to/product.csv", header: true) do |csv_row|
begin
Product.create!(
name: csv_row[0],
factory_price: csv_row[1],
sell_price: csv_row[2]
)
rescue
puts "Hey! I need help => #{csv_row.class}:#{csv_row.inspect}"
next
end
end
with this approach, It will create an INSERT
SQL statement
INSERT INTO products (name, factory_price, sell_price) VALUES ('Rose', 5.0, 10.5);
And it will be running this INSERT
query N times (N is the count for CSV file rows)
Not very efficient right? 😎
Let’s optimize the above SQL insert using batch.
# initialize and insert the product to products array
products = []
CSV.foreach("path/to/product.csv", header: true) do |csv_row|
products << %Q{ ('#{csv_row[0]}','#{csv_row[1]}','#{csv_row[2]}') }
end
# create raw sql to insert bulk values
bulk_products_insert_sql =
"INSERT INTO products (name, factory_price, sell_price)"\
" VALUES#{products.join(', ')}"
# excute the insert query
ActiveRecord::Base.connection.execute(bulk_products_insert_sql)
It will generate the SQL similar to this
INSERT INTO products (name, factory_price, sell_price)
VALUES('product1', 1.0, 5.5), ('product2', 2.0, 5.0),
('product1', 3.0, 4.5), ('product2', 4.0, 10.0),
('product1', 5.0, 10.5), ('product2', 6.0, 8.0),
('product1', 7.0, 12.5), ('product2', 8.0, 20.0),
...
...
Note: if you have much larger CSV data perform the bulk insert in batches of 1000-2000
so that we shouldn’t reach the DB max query or some other limits.
ex:
# initialize and insert the product to products array
products = []
CSV.foreach("path/to/product.csv", header: true) do |csv_row|
products << %Q{ ('#{csv_row[0]}','#{csv_row[1]}','#{csv_row[2]}') }
end
# create raw SQL to insert bulk values with a batch of 1500
in_batch_of = 1500
products.each_slice(in_batch_of) do |products_batch|
bulk_products_insert_sql =
"INSERT INTO products (name, factory_price, sell_price)"\
" VALUES#{products_batch.join(', ')}"
# excute the insert query
ActiveRecord::Base.connection.execute(bulk_products_insert_sql)
end