Manu

Seeding 1 million test data in postgres using sequel

This is a note of things that I did for my testing setup.

While exploring postgres and testing few features I had to setup a database for it and seed with lot of data. Below are my notes while doing the same.

The aim is to create 1 million entries in postgres in ruby using sequel and faker gem. This post is very comprehensive and will be sharing the commands and steps which are self explanatory. Most of the database related task has been done using psql.

Creating the database and table

create database text_search_db;

create table users(
	id BIGINT PRIMARY KEY NOT NULL,
	first_name text,
	second_name text,
	middle_name text,
	age INT,
	sex text,
	phone_number text
	);

Generating the data using faker and sequel gems.

require "faker"
require "sequel"

starting = Process.clock_gettime(Process::CLOCK_MONOTONIC)

# DB connection
DB = Sequel.postgres(host: 'localhost', user: 'manu', password: "", database: 'text_search_dev')

# We can initiate a table mapping in sequel like below
class User < Sequel::Model(:users)
end

# Loop a million times
1_000_000.times do |i|
  first_name = Faker::Name.first_name
  middle_name = Faker::Name.middle_name
  second_name = Faker::Name.last_name
  age = Faker::Number.between(from: 18, to: 80)
  sex = Faker::Gender.binary_type
  phone_number = Faker::PhoneNumber.cell_phone_in_e164

  User.create(
    first_name: first_name,
    middle_name: middle_name,
    second_name: second_name,
    age: age,
    sex: sex,
    phone_number: phone_number
  )
end

ending = Process.clock_gettime(Process::CLOCK_MONOTONIC)

elapsed = ending - starting

puts "Time taken: #{elapsed}"

faker is a really good gem at generating data, it should suffice most of the use cases. Similarly I find sequel to be a really great gem, I am starting to use it more and more in personal projects. I like how minimal it is yet functional, reminds me of ecto library of elixir.