Pulling Shopify Data Into R With shopr
In this tutorial I’ll show you how to use the shopr package to fetch data from a Shopify shop into R. (Not to be confused with the shopifyr package.)
Disclaimer I’m the author of shopr.
Setup
Install shopr
The first thing you should do is install the shopr package. I haven’t tried submitting it to CRAN (perhaps I will in the future), so for now you need to install it directly from github. I recommend using devtools for this.
install.packages("devtools")
devtools::install_github("ben519/shopr")
Create A Shopify Partner Account
shopr reads data via the showily admin API. As such, you need to create a shopify partner account in order to get access to the API. From the docs,
The Shopify Partner program includes developers, designers, and experts who build on the Shopify platform. You can sign up for a free Shopify Partner account by registering here.
Create A Development Store
This step isn’t necessary, but it’s recommended to create a development store. A development store is a fake store you fill with fake customers, orders, products, etc. to tinker with before you start working with real, live stores.
Generate API Credentials From The Shopify Admin
The last setup step is to generate API credentials from the shopify admin. This is done within your shop’s admin.
Start by navigating to your shop’s admin page. Then click on Apps.
Next, click on Manage private apps.
Then Create a new private app.
Fill out the form with your app name and email address. shopr only reads data, so choose “Read access” as opposed to “Read and write access”. Also, shopr doesn’t use webooks, so the choice for Webhook API version is unimportant. Hit Save.
Lastly, click Review disabled Admin API permissions and grant read-access for Orders, Locations, Inventory, and Products. Also take note of your API key and Password.
Show Me How To Pull Data Already
Once you have an API key and Password, pulling data is as easy as
library(shopr)
ordrs <- shopr_get_orders(
shopURL = "https://my-test-store.myshopify.com",
APIKey = "abc123",
APIPassword = "def456"
)
The result of this is a big list of data.tables. If don’t know what a data.table is, it’s like a data.frame on steroids. Check out the data.table package for more details. If data.tables make you feel uncomfortable, you can always cast them back to a normal data.frame using as.data.frame()
.
In this case, my call to shopr_get_orders()
returned 8 data.tables shown below. (Most columns have been excluded for the sake of brevity.)
ordrs
## $orders
## id order_number email created_at
## 1: 3370957763 1001 fred@gmail.com 2016-07-23T17:38:04-04:00
## 2: 3411441347 1002 bill@gmail.com 2016-07-29T14:35:55-04:00
## 3: 3411467139 1003 jane@yahoo.com 2016-07-29T14:39:04-04:00
## 4: 3432027523 1004 ssmith@gmail.com 2016-08-01T20:15:43-04:00
## 5: 3432047747 1005 jblow@gmail.com 2016-08-01T20:19:41-04:00
## 6: 3493870211 1006 bgorman@GormAnalysis.com 2016-08-10T17:41:24-04:00
## 7: 3494068227 1007 ryan@gmail.com 2016-08-10T18:06:33-04:00
## 8: 3494073347 1008 bgorman@GormAnalysis.com 2016-08-10T18:07:22-04:00
## processed_at total_price total_tax
## 1: 2016-07-23T17:38:04-04:00 117.75 9.75
## 2: 2016-07-29T14:35:55-04:00 228.50 20.50
## 3: 2016-07-29T14:39:04-04:00 9.10 0.10
## 4: 2016-08-01T20:15:43-04:00 109.00 0.00
## 5: 2016-08-01T20:19:41-04:00 310.00 0.00
## 6: 2016-08-10T17:41:24-04:00 62.02 5.02
## 7: 2016-08-10T18:06:33-04:00 226.30 20.30
## 8: 2016-08-10T18:07:22-04:00 62.58 5.08
##
## $discount_applications
## .id type value value_type allocation_method target_selection
## 1: 6 discount_code 1.0 fixed_amount across all
## 2: 7 discount_code 2.0 fixed_amount each entitled
## 3: 8 discount_code 0.5 fixed_amount each entitled
## target_type code order_id
## 1: line_item bluesale 3493870211
## 2: line_item table1 3494068227
## 3: line_item disc1 3494073347
##
## $discount_codes
## .id code amount type order_id
## 1: 6 bluesale 1.00 fixed_amount 3493870211
## 2: 7 table1 2.00 fixed_amount 3494068227
## 3: 8 disc1 0.50 fixed_amount 3494073347
##
## $tax_lines
## price rate title price_set.shop_money.amount
## 1: 5.00 0.0500 LA State Tax 5.00
## 2: 4.75 0.0475 Covington Municipal Tax 4.75
## 3: 10.00 0.0500 LA State Tax 10.00
## 4: 10.50 0.0525 Mandeville Municipal Tax 10.50
## 5: 0.05 0.0500 LA State Tax 0.05
## 6: 0.05 0.0525 Mandeville Municipal Tax 0.05
## 7: 2.45 0.0500 LA State Tax 2.45
## 8: 2.57 0.0525 Mandeville Municipal Tax 2.57
## 9: 9.90 0.0500 LA State Tax 9.90
## 10: 10.40 0.0525 Mandeville Municipal Tax 10.40
## 11: 2.48 0.0500 LA State Tax 2.48
## 12: 2.60 0.0525 Mandeville Municipal Tax 2.60
## price_set.shop_money.currency_code order_id
## 1: USD 3370957763
## 2: USD 3370957763
## 3: USD 3411441347
## 4: USD 3411441347
## 5: USD 3411467139
## 6: USD 3411467139
## 7: USD 3493870211
## 8: USD 3493870211
## 9: USD 3494068227
## 10: USD 3494068227
## 11: USD 3494073347
## 12: USD 3494073347
##
## $line_items
## id order_id variant_id product_id quantity sku
## 1: 6170376643 3370957763 24320788803 7668983171 1
## 2: 6253411459 3411441347 24320788803 7668983171 2
## 3: 6253465027 3411467139 24935166595 7758137923 1
## 4: 6293953731 3432027523 24935166595 7758137923 1
## 5: 6293953795 3432027523 24320788803 7668983171 1
## 6: 6293988675 3432047747 24935166595 7758137923 2
## 7: 6293988739 3432047747 25091764739 7668983171 1
## 8: 6293988803 3432047747 25091764803 7668983171 1
## 9: 6293988867 3432047747 24320788803 7668983171 1
## 10: 6419054723 3493870211 25557932419 7923756931 1
## 11: 6419460163 3494068227 25557932547 7923756931 1
## 12: 6419460227 3494068227 25557932419 7923756931 1
## 13: 6419460291 3494068227 24320788803 7668983171 1
## 14: 6419469763 3494073347 25557932419 7923756931 1
## title variant_title
## 1: Cool Painting
## 2: Cool Painting
## 3: Sweet Picture of GormAnalysis Home Page
## 4: Sweet Picture of GormAnalysis Home Page
## 5: Cool Painting
## 6: Sweet Picture of GormAnalysis Home Page
## 7: Cool Painting medium / blue
## 8: Cool Painting medium / green
## 9: Cool Painting small / red
## 10: New Prod small
## 11: New Prod large
## 12: New Prod small
## 13: Cool Painting small / red
## 14: New Prod small
##
## $fulfillments
## id order_id status created_at service location_id
## 1: 2762536579 3370957763 success 2016-07-29T16:03:15-04:00 manual 8571779
## 2: 2836684995 3411441347 success 2016-08-10T17:42:35-04:00 manual 8571779
## 3: 2836685059 3411467139 success 2016-08-10T17:42:35-04:00 manual 8571779
## 4: 2836685251 3432027523 success 2016-08-10T17:42:36-04:00 manual 8571779
## 5: 2836685315 3432047747 success 2016-08-10T17:42:36-04:00 manual 8571779
## 6: 2836685507 3493870211 success 2016-08-10T17:42:36-04:00 manual 8571779
##
## $refunds
## Empty data.table (0 rows and 1 cols): order_id
##
## $shipping_lines
## id order_id title price code source
## 1: 2757915779 3370957763 Standard Shipping 8.00 Standard Shipping shopify
## 2: 2788977603 3411441347 Standard Shipping 8.00 Standard Shipping shopify
## 3: 2788996355 3411467139 Standard Shipping 8.00 Standard Shipping shopify
## 4: 2803821123 3432027523 Standard Shipping 8.00 Standard Shipping shopify
## 5: 2803837635 3432047747 Standard Shipping 8.00 Standard Shipping shopify
## 6: 2850898499 3493870211 Standard Shipping 8.00 Standard Shipping shopify
## 7: 2851056899 3494068227 Standard Shipping 8.00 Standard Shipping shopify
## 8: 2851060931 3494073347 Standard Shipping 8.00 Standard Shipping shopify
WARNING: If your shop has a lot of orders, you’ll probably want to pass additional parameters to shopr_get_orders()
in order to subset the orders you’re requesting. For example, you can get orders created within the last 7 days.
ordrs <- shopr_get_orders(
shopURL = "https://my-test-store.myshopify.com",
APIKey = "abc123",
APIPassword = "def456",
created_at_min = Sys.time() - 60*60*24*7,
created_at_max = NULL
)
Or you could pull orders that haven’t been fulfilled.
ordrs <- shopr_get_orders(
shopURL = "https://my-test-store.myshopify.com",
APIKey = "abc123",
APIPassword = "def456",
fulfillment_status = "null"
)
The functions shopr_get_inventory_items()
, shopr_get_inventory_levels()
, shopr_get_locations()
, and shopr_get_products()
behave almost identically to shopr_get_orders()
, but with slightly different parameters.
Pagination
If you request something with a large result-set, shopr will automatically use pagination (as required by Shopify) to pull the data in chunks and then stitch the results together. For example, if you request orders in the recent 30 days, and this turns out to be 500 orders, shopr will fetch two chunks of 250 orders (the max chunk size allowed by Shopify). The parameters limit_per_page
and max_pages
control the pagination settings, but their default values should be ideal. Additionally, shopr makes requests as quickly as it can without breaching your Shopify API request quota.