
Get insights into your mobile app usage patterns using Amazon Mobile Analytics and R

Update: April 30, 2018: We’ve discontinued Amazon Mobile Analytics. Amazon Pinpoint now provides the analytics features that Amazon Mobile Analytics previously offered. If you’re new to Mobile Analytics, you can integrate the mobile analytics features of Amazon Pinpoint into your app. If you currently use Amazon Mobile Analytics, you can migrate to Amazon Pinpoint.


The key to improving user engagement with your mobile app is to understand how users behave in the app and then to optimize experiences based on those insights. Finding meaningful patterns in app event data can be challenging, however, and standard KPIs such as monthly active users (MAU) or daily active users (DAU) often don’t portray the complete picture. For example, the distribution of the number of app opens per user over the last 30 days shows how your users actually use the app, whereas MAU only counts the distinct users who have used it. With the ability to perform custom analysis of the app usage events and metrics that are important to your app, you can make better decisions about how to modify your app experiences to improve user engagement.

Amazon Mobile Analytics recently launched the Auto Export feature, which exports event data from your apps to Amazon S3 and Amazon Redshift. With your data in these stores, you can now use external tools such as R, Tableau, and SQL Workbench to explore the data and produce meaningful insights and visualizations. In today’s blog post, I will go through the steps required to connect R (a free software environment for statistical computing and graphics) with Amazon Mobile Analytics data in Amazon Redshift. I will also provide sample code to query Amazon Redshift data in R and perform basic analysis.

First, you will need to install R. Also, if you haven’t already done so, turn on the Amazon Mobile Analytics Auto Export feature. You’ll find instructions here.

Step 1:

Open R and, at the R command prompt, install the RPostgreSQL package:

 install.packages("RPostgreSQL")  

Step 2:

Use the library function to load and access the package:

  library(RPostgreSQL)  

Step 3:

Use these commands to connect R to Amazon Redshift:

postgressdriver <- dbDriver("PostgreSQL")

redshift_connect<- dbConnect(postgressdriver
,host="<<ENTERHOSTDETAILSHERE>>"
,port="<<ENTERPORTDETAILSHERE>>"
,dbname="<<ENTERDBNAMEHERE>>"
,user="<<ENTERUSERNAMEHERE>>"
,password="<<ENTERPASSWORDHERE>>")  
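
If you would rather not type credentials into a script, one option is to read them from environment variables. The following is a minimal sketch, not part of the original steps: the `REDSHIFT_*` variable names and the `redshift_settings` helper are assumptions made for illustration, and the actual connection call is left commented out because it needs a reachable cluster.

```r
# Hypothetical environment variable names; pick whatever fits your setup.
redshift_settings <- function() {
  list(
    host     = Sys.getenv("REDSHIFT_HOST"),
    port     = Sys.getenv("REDSHIFT_PORT", unset = "5439"),  # 5439 is the Redshift default port
    dbname   = Sys.getenv("REDSHIFT_DBNAME"),
    user     = Sys.getenv("REDSHIFT_USER"),
    password = Sys.getenv("REDSHIFT_PASSWORD")
  )
}

settings <- redshift_settings()

# Report any setting that is still empty before trying to connect.
missing_settings <- names(settings)[!nzchar(unlist(settings))]
if (length(missing_settings) > 0) {
  message("Missing settings: ", paste(missing_settings, collapse = ", "))
}

# With all settings present, the connection call would look like this:
# con <- do.call(dbConnect, c(list(dbDriver("PostgreSQL")), settings))
# ... run queries ...
# dbDisconnect(con)  # release the connection when finished
```

Calling `dbDisconnect` when you are done frees the connection on the cluster.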

Step 4:

Use the following commands to query the data in Amazon Redshift and load it into a data frame in R:

# Inner double quotes are escaped so that R reads the query as a single string.
sampledata <- dbGetQuery(redshift_connect,
  "SELECT
      application_app_id AS \"app id\",
      COUNT(DISTINCT client_cognito_id) AS \"users\"
   FROM
      AWSMA.event
   WHERE
      event_type = '_session.start' AND
      event_timestamp BETWEEN getdate() - 30 AND getdate() + 1
   GROUP BY
      application_app_id")
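
Because the query is an ordinary R string, the look-back window can be turned into a parameter. The sketch below assumes a hypothetical helper named `build_users_query`; it only builds the SQL text and does not touch the database, so you can inspect the result before running it.

```r
# Hypothetical helper: builds the Step 4 query for a given number of days.
build_users_query <- function(days = 30) {
  days <- as.integer(days)
  stopifnot(!is.na(days), days > 0)  # reject non-numeric or non-positive input
  sprintf(
    "SELECT
       application_app_id AS \"app id\",
       COUNT(DISTINCT client_cognito_id) AS \"users\"
     FROM
       AWSMA.event
     WHERE
       event_type = '_session.start' AND
       event_timestamp BETWEEN getdate() - %d AND getdate() + 1
     GROUP BY
       application_app_id",
    days
  )
}

query <- build_users_query(7)
cat(query, "\n")  # print the SQL to check it before sending it to Redshift

# With an open connection, the query would be run like this:
# sampledata <- dbGetQuery(redshift_connect, build_users_query(30))
```

Coercing `days` to an integer before formatting keeps arbitrary text out of the SQL string.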

Now let’s look at some use cases and code.

Use case: Analyze the distribution of app opens by customers in the last 7 days 

postgressdriver <- dbDriver("PostgreSQL")  

con <- dbConnect(postgressdriver
,host="<<ENTERHOSTDETAILSHERE>>"
,port="<<ENTERPORTDETAILSHERE>>"
,dbname="<<ENTERDBNAMEHERE>>"
,user="<<ENTERUSERNAMEHERE>>"
,password="<<ENTERPASSWORDHERE>>") 
 
# Inner double quotes are escaped so that R reads the query as a single string.
df <- dbGetQuery(con,
  "SELECT
      client_cognito_id AS \"users\",
      COUNT(*) AS \"appopens\"
   FROM
      AWSMA.v_event
   WHERE
      event_type = '_session.start' AND
      event_timestamp BETWEEN getdate() - 7 AND getdate() + 1
   GROUP BY
      client_cognito_id")

## plot the histogram; breaks extend to the largest count so no value falls outside the bins

hist(df$appopens,main="App opens distribution"
,xlab="Number of app opens in the last 7 days"
,ylab="Number of users"
,breaks=seq(0,max(14,df$appopens),1)
,col="green")
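
Alongside the histogram, a few numeric summaries can make the distribution easier to discuss. The sketch below uses a small made-up data frame in place of the query result, so the user IDs and counts are purely illustrative; only the column names match the query above.

```r
# Hypothetical data standing in for the result of the query above.
df <- data.frame(
  users    = paste0("user", 1:8),
  appopens = c(1, 1, 2, 3, 3, 3, 7, 20)
)

# Number of users at each app-open count.
opens_table <- table(df$appopens)
print(opens_table)

# Minimum, quartiles, mean, and maximum of app opens per user.
print(summary(df$appopens))

# Share of users who opened the app exactly once in the window.
share_single <- mean(df$appopens == 1)
print(share_single)
```

A large share of single-open users, for instance, might suggest looking more closely at the first-session experience.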

 

Use case: Understand the distribution of active users in the last 30 days by different platforms

postgressdriver <- dbDriver("PostgreSQL")
  
con <- dbConnect(postgressdriver
,host="<<ENTERHOSTDETAILSHERE>>"
,port="<<ENTERPORTDETAILSHERE>>"
,dbname="<<ENTERDBNAMEHERE>>"
,user="<<ENTERUSERNAMEHERE>>"
,password="<<ENTERPASSWORDHERE>>")
  
# Inner double quotes are escaped so that R reads the query as a single string.
df <- dbGetQuery(con,
  "SELECT
      device_platform_name AS \"platform\",
      COUNT(DISTINCT client_cognito_id) AS \"active_users\"
   FROM
      AWSMA.v_event
   WHERE
      event_type = '_session.start' AND
      event_timestamp BETWEEN getdate() - 30 AND getdate() + 1
   GROUP BY
      device_platform_name")
 
## plot a pie chart

pct <- round(df$active_users/sum(df$active_users)*100) # share of each platform, in percent
lbls <- paste(df$platform, pct) # add percents to labels
lbls <- paste(lbls,"%",sep="") # add % to labels
pie(df$active_users, labels = lbls
, main="Distribution of active users across platforms"
,col=rainbow(length(lbls)))
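
When there are many platforms, a sorted bar chart can be easier to read than a pie chart. The following sketch shows that alternative under stated assumptions: the data frame is made up for illustration (the platform names and counts are not real results) and stands in for the query output.

```r
# Hypothetical result set standing in for the query output.
df <- data.frame(
  platform     = c("iOS", "Android", "Kindle"),
  active_users = c(300, 600, 100)
)

# Sort platforms from most to fewest active users.
df <- df[order(df$active_users, decreasing = TRUE), ]

# Build "platform percent%" labels, as in the pie chart code.
pct  <- round(df$active_users / sum(df$active_users) * 100)
lbls <- paste0(df$platform, " ", pct, "%")

barplot(df$active_users, names.arg = lbls,
        main = "Active users by platform",
        ylab = "Active users", col = rainbow(length(lbls)))
```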

Conclusion

By exporting the event data from your apps, you can use analytics tools to extract insights that help you understand and grow your mobile app business. Using Amazon Redshift, you can query your data directly with SQL, and as we’ve shown in this post, you can use R to produce visualizations easily. We’d love to hear how you’re using Amazon Mobile Analytics and Amazon Redshift to get the most out of your data. Feel free to post your suggestions or new feature/blog requests in the comments or in our forum.