How To Add To A Database With Activerecord

Data migration is a delicate and sometimes complicated and time-consuming process. Whether you are loading data from a legacy application to a new application or you just want to move data from one database to another, you'll most likely need to create a migration script that is accurate, efficient, and fast, especially if you are planning to load a huge amount of data.

There are several ways you can load data from an old Rails app or other application to Rails. In this article, I'll explain a few ways to load data to a PostgreSQL database with Rails. We'll go over their pros and cons, so you can choose the method that works best for your situation.

Postgres is an innovative database. According to a recent study by DB-Engines (PDF), PostgreSQL's popularity rating increased by 65 percent from January 2016 to January 2019, while the rating of MySQL, SQL Server, and Oracle decreased by 10 to 16 percent during the same period.

PostgreSQL has a strong reputation for handling large data sets. However, with the wrong tools and solutions, its powers can be undermined. So what's the fastest way to load data to a Postgres database in your Rails app? Let's look at four different methods, then we'll see which is the fastest.

  • Inserting one record at a time to load data to your Postgres database
    • Pros of single-row inserts with Postgres
    • Cons of single-row inserts with Postgres
  • Bulk Inserts with Active Record Import to load data to your Postgres database
    • Pros of Bulk Inserts with Active Record in Ruby on Rails and Postgres
    • Cons of Bulk Inserts with Active Record in Ruby on Rails and Postgres
  • Using PostgreSQL Copy with Activerecord-copy to load data to your Postgres database
    • Pros of using PostgreSQL Copy with Activerecord-copy
    • Cons of using PostgreSQL Copy with Activerecord-copy
  • 4. Using background jobs to load data to your Postgres database
  • Final Thoughts About Loading Large Data Sets into a PostgreSQL Database with Rails
  • Speed comparison of different ways to load data into Postgres with Rails
  • Other articles and resources you might like

Diagram of Ruby on Rails Insert Methods

Inserting one record at a time to load data to your Postgres database

One easy way to load data to a Postgres database is to loop through the data and insert the records one at a time.

Here's some sample code to do this in Rails, assuming we have the source data in a CSV file:

    # lib/tasks/one_record_at_a_time.rake
    require 'csv'
    require "benchmark"

    namespace :import do
      desc "imports data from csv to postgresql"
      task :single_record => :environment do
        # This part loops over the content of the csv file and creates a new record for each row.
        # `filename` is the path to your CSV file.
        def insert_user
          CSV.foreach(filename, headers: true) do |row|
            # Convert the CSV row to a plain hash of attributes before creating the record.
            User.create(row.to_h)
          end
        end

        # Here we are using Benchmark to measure the speed (elapsed seconds).
        puts Benchmark.realtime { insert_user }
      end
    end

But there's a problem with this approach. Inserting data one at a time into a PostgreSQL database is extremely slow. I ran this Rake task to insert over a million records and measured it with Benchmark. The report came back with a result of over 1.3 hours, which is a long time. There's overhead in both the database and the application in processing rows one by one, and additional latency in waiting for the database round trip for each row.

We'll see a better approach in the next section, but for now, here's a summary of the pros and cons of single-row inserts:

Pros of single-row inserts with Postgres

  • Doesn't require an external dependency

Cons of single-row inserts with Postgres

  • Very slow
  • Might lock your session for a long time
  • Not suitable for inserting large data sets
  • If one insert fails, you're stuck with partially loaded data

Bulk Inserts with Active Record Import to load data to your Postgres database

Running a bulk insert query is a better and faster way to load data into your Postgres database, and the Rails gem activerecord-import makes it easy to load massive data in bulk in a way that the Active Record ORM can understand and manipulate.

Instead of hitting your database multiple times, processing transactions, and doing all the back and forth between your app and database, the Active Record Import gem allows you to build up large insert queries and run them at once.
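To make the idea concrete, here is a minimal sketch of what "one large insert query" means: a single multi-row INSERT statement instead of one statement per row. It is an illustration only and not how activerecord-import builds its queries internally. The helper name build_bulk_insert and the naive quoting are assumptions made for this example; real code should leave escaping to the database adapter.

```ruby
# Illustrative only: builds a single multi-row INSERT statement as a string.
# `build_bulk_insert` is a hypothetical helper, not part of any gem.
def build_bulk_insert(table, columns, rows)
  # Naive quoting for the sketch: wrap in single quotes and double any embedded quote.
  quote = ->(value) { "'#{value.to_s.gsub("'", "''")}'" }
  tuples = rows.map { |row| "(#{row.map(&quote).join(', ')})" }
  "INSERT INTO #{table} (#{columns.join(', ')}) VALUES #{tuples.join(', ')}"
end

sql = build_bulk_insert(
  "users",
  %w[first_name last_name],
  [["Peter", "Joseph"], ["Banabas", "Bob Jones"]]
)
puts sql
```

Two rows travel to the database in one statement here; with a million rows the same principle saves a very large number of round trips.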

You can install the Active Record Import gem by adding gem 'activerecord-import' to your Gemfile and running bundle install in your terminal. This gem adds import to Active Record classes. That means you'll just need to call the import method on your model classes to load the data into your database.

Here is an example:

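    # lib/tasks/active_record_import.rake
    require 'csv'
    require "benchmark"

    namespace :import do
      desc "imports data from csv to postgresql"
      task :batch_record => :environment do
        # Collect every CSV row as a hash of attributes.
        # `filename` is the path to your CSV file.
        users = []
        CSV.foreach(filename, headers: true) do |row|
          users << row.to_h
        end

        # Build unsaved User objects; nothing is written to the database yet.
        newusers = users.map do |attrs|
          User.new(attrs)
        end

        # Insert all records in bulk and measure the elapsed time in seconds.
        time = Benchmark.realtime { User.import(newusers) }
        puts time
      end
    end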

Notice how we're building up the records in an array (users), turning them into User objects (newusers), and passing that array to the import method on the User model: User.import(newusers).

That's really all that needs to be done. However, you can choose to pass just some specific columns and their values to the import method if you want to. For example, User.import columns, values where columns is an array like ["first_name", "last_name"], while values is an array of arrays like [ ['Peter', 'Joseph'], ['Banabas', 'Bob Jones'] ].
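As a rough sketch of the columns-and-values form, the snippet below turns CSV data into the two arrays described above. The sample data, the constant CSV_TEXT, and the helper columns_and_values are hypothetical names used for illustration; only the final commented line involves the gem.

```ruby
require "csv"

# Hypothetical sample data standing in for the CSV file.
CSV_TEXT = <<~DATA
  first_name,last_name,email
  Peter,Joseph,peter@example.com
  Banabas,Bob Jones,banabas@example.com
DATA

# Returns [columns, values], restricted to the columns we want to import.
def columns_and_values(csv_text, wanted)
  table = CSV.parse(csv_text, headers: true)
  values = table.map { |row| row.values_at(*wanted) }
  [wanted, values]
end

columns, values = columns_and_values(CSV_TEXT, %w[first_name last_name])
p columns
p values

# In a Rails app with activerecord-import installed, these would be passed as:
#   User.import columns, values
```

Passing plain arrays like this skips building a model object for every row, which may matter when the data set is large.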

I analyzed loading a million records into a Postgres database with Rails using this method, and it took only 5.1 minutes. Remember the first method took 1.3 hours? That is 78 minutes versus 5.1 minutes, so this method is roughly 15 times faster. That's impressive.

Pros of Bulk Inserts with Active Record in Ruby on Rails and Postgres

  • Follows Active Record Associations, meaning Rails ORM is able to do its magic with the loaded data
  • Faster to load data into your PostgreSQL database
  • Doesn't have per-row overhead
  • If the insert fails, your transaction will roll back the insert

Cons of Bulk Inserts with Active Record in Ruby on Rails and Postgres

  • The activerecord-import gem might conflict with other gems that add an .import method to the Active Record model. However, in cases where this might happen, you can use the .bulk_import method, which is also attached to your model classes, as an alternative.

See how batch import improved our speed by roughly 15x? That was incredible, right? There is still a faster way to load data to a Postgres database.


Using PostgreSQL Copy with Activerecord-copy to load data to your Postgres database

COPY is the fastest way to load data to a PostgreSQL database; it has the power of a bulk insert and also avoids some of the overhead of repeatedly parsing and planning an INSERT.

The gem activerecord-copy provides an easy-to-use interface for implementing COPY in your Rails app. You'll need to add the line gem 'activerecord-copy' to your Gemfile and run bundle install in your terminal to install the gem and get ready to use it.

Here is a sample Rake task showing how you can use it:

    # lib/tasks/active_record_copy.rake
    require 'csv'
    require "benchmark"

    namespace :copy do
      desc "imports data from csv to postgresql"
      task :data => :environment do
        def insert_user
          # `filename` is the path to your CSV file.
          users = []
          CSV.foreach(filename, headers: true) do |row|
            users << row
          end

          # COPY does not set timestamps, so create one value to reuse for every row.
          time = Time.now.getutc

          User.copy_from_client [:first_name, :last_name, :email, :created_at, :updated_at] do |copy|
            users.each do |d|
              # CSV headers are strings by default, so rows are indexed with string keys.
              copy << [d["first_name"], d["last_name"], d["email"], time, time]
            end
          end
        end

        puts Benchmark.realtime { insert_user }
      end
    end

The activerecord-copy gem adds a copy_from_client method to all your model classes, as shown in the snippet above (you'll have to define the columns and their values as shown).

Note that when you use the activerecord-copy gem, timestamps are not created for you automatically, because Rails does not set them when data is loaded with COPY. You'll have to create them yourself, which is why the snippet sets time = Time.now.getutc and passes it for both created_at and updated_at.
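The row-building step can be sketched on its own, without a database. The example below is a minimal illustration of attaching one shared UTC timestamp to every row before it is handed to COPY; the sample data, the constant SAMPLE_CSV, and the helper rows_with_timestamps are hypothetical names introduced here.

```ruby
require "csv"

# Hypothetical sample data standing in for the CSV file.
SAMPLE_CSV = <<~DATA
  first_name,last_name,email
  Peter,Joseph,peter@example.com
DATA

# Builds the arrays that would be appended to the COPY stream. The same
# timestamp is used for created_at and updated_at because loading through
# COPY skips the Active Record code path that normally fills them in.
def rows_with_timestamps(csv_text, now = Time.now.getutc)
  CSV.parse(csv_text, headers: true).map do |row|
    [row["first_name"], row["last_name"], row["email"], now, now]
  end
end

rows_with_timestamps(SAMPLE_CSV).each { |row| p row }
```

Each array lines up, position by position, with the column list passed to copy_from_client in the Rake task.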

Pros of using PostgreSQL Copy with Activerecord-copy

  • Doesn't have per-row overhead
  • If the insert fails, your transaction will roll back the insert
  • Super fast

Cons of using PostgreSQL Copy with Activerecord-copy

  • Manually set timestamps (created_at, updated_at, etc.)

I analyzed the activerecord-copy performance with a transaction of over one million records, as I did for the other methods, and it took about 1.5 minutes. That is insanely fast compared to the other methods we've seen in this article.

4. Using background jobs to load data to your Postgres database

If you often load new data to your database, one neat way to improve your app's performance is to run your data loading using a background job. There are several tools that make this possible, for example, the delayed_job gem, Sidekiq, and Resque.

However, just as it provides Active Record, Rails provides Active Job to allow you to use any of these supported adapters inside your Rails app without bothering about job-specific implementation. So you could set up a script for Active Record and run the script in a background job using Active Job and the delayed_job adapter. That way, you'll be running your data loading in the background.

Let's walk through how to set up your Active Job to run your background process:

  1. Since you're going to use the delayed_job adapter, install the delayed_job_active_record gem.
  2. Add gem 'delayed_job_active_record' to your Gemfile.
  3. Run bundle install in your terminal/command line.
  4. Run the following commands to create a migration for the delayed jobs table and apply it:
            rails g delayed_job:active_record
            rake db:migrate
  5. Generate an Active Job by running the following command:
            rails generate job import_data
  6. Open the file created in your app/jobs directory (app/jobs/import_data_job.rb) and add your data loading code:
            # app/jobs/import_data_job.rb
            class ImportDataJob < ApplicationJob
              queue_as :default

              def perform(*args)
                # Write your code here to load records to the database.
                # You can use any of the fast methods we've discussed.
              end
            end
  7. In order for Rails to be aware of the Active Job adapter you want to use, you need to add the adapter to your config file. The Active Job adapter for the delayed_job_active_record gem is named :delayed_job, so simply add this line: config.active_job.queue_adapter = :delayed_job.
            # config/application.rb
            module YourApp
              class Application < Rails::Application
                # Be sure to have the adapter's gem in your Gemfile
                # and follow the adapter's specific installation
                # and deployment instructions.
                config.active_job.queue_adapter = :delayed_job
              end
            end

Depending on how often you want the job to run, you can set up the job to be enqueued at a specific time or immediately, following the instructions in the Active Job documentation.

One way you can do this is to allow the job to run asynchronously. Create a Rake task, add ImportDataJob.perform_later to the task, and run it. Example:

    namespace :active_jobs do
      desc "imports data from sql to postgresql"
      task :import => :environment do
        # Enqueue the job; it runs in the background instead of blocking this task.
        ImportDataJob.perform_later
      end
    end

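Once this is done, you can run the task rake active_jobs:import in your terminal. Keep in mind that a delayed_job worker (for example, one started with rake jobs:work) has to be running for the queued job to be processed.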

Final Thoughts About Loading Large Data Sets into a PostgreSQL Database with Rails

When considering how to optimize your database performance, it's best to first figure out the optimization options the database has already provided. As you may have noticed, most of the tools and techniques in this article leverage the hidden power of the PostgreSQL database. Sometimes, it might just be your implementation slowing down your database performance.

Speed comparison of different ways to load data into Postgres with Rails

Here's a table summarizing the various speeds of the methods discussed in this article.

Method                                   Time taken            Number of records
One record at a time insert              1.3 hours             1,000,000
Bulk inserts with Activerecord Import    5.1 minutes           1,000,000
PostgreSQL Copy with Activerecord-copy   1.5 minutes           1,000,000
Using Background Jobs                    < 1 sec (perceived)   1,000,000

You've learned that if you're loading a huge amount of data into your PostgreSQL database, one insert at a time is slow and shouldn't even be considered. For ultimate performance, you want to use COPY. Of course, you've also seen the caveats of each method, and you should weigh all the pros and cons before making your final decision.
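For a quick sense of scale, the relative speedups can be worked out from the timings reported in this article. This is simple arithmetic on the three measured methods; the background job figure is left out because it is a perceived time, not a load time. The constant TIMINGS_MIN and the helper speedup are names made up for this sketch.

```ruby
# Timings reported in this article, converted to minutes.
TIMINGS_MIN = {
  "single-row inserts" => 1.3 * 60, # 1.3 hours
  "activerecord-import" => 5.1,
  "activerecord-copy" => 1.5
}.freeze

# How many times faster `fast` is than `slow`, rounded to one decimal place.
def speedup(slow, fast)
  (TIMINGS_MIN.fetch(slow) / TIMINGS_MIN.fetch(fast)).round(1)
end

puts speedup("single-row inserts", "activerecord-import") # => 15.3
puts speedup("single-row inserts", "activerecord-copy")   # => 52.0
puts speedup("activerecord-import", "activerecord-copy")  # => 3.4
```

These ratios only describe the runs reported here; results on other hardware, schemas, or data will differ.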


Other articles and resources you might like

Using Postgres Row-Level Security in Ruby on Rails

Creating Custom Postgres Data Types in Rails

Efficient Search in Rails with Postgres (PDF eBook)

PostGIS vs. Geocoder in Rails

Advanced Active Record: Using Subqueries in Rails

Full Text Search in Milliseconds with Rails and PostgreSQL

Effectively Using Materialized Views in Ruby on Rails

Similarity in Postgres and Rails using Trigrams

Efficient GraphQL queries in Ruby on Rails & Postgres


Source: https://pganalyze.com/blog/fastest-way-importing-data-into-postgres-with-ruby-rails

Posted by: parkerthavercuris.blogspot.com
