Data Requirements
What Kind Of Data Do We Need?
User Events
Geo-tagged events that a user performs on a mobile app or website, used to understand area-level funnel conversions.
Mandatory Columns in User Events Data:
user_id
user_lat
user_long
timestamp
event_name
The event can have other attributes associated with it, but they are not mandatory.
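As a rough illustration, the mandatory columns above can be checked before upload with a few lines of Python. This is only a sketch: the helper name missing_columns and the sample values (including the optional city attribute) are hypothetical and not part of the product.

```python
# Mandatory columns for User Events, as listed above.
USER_EVENT_COLUMNS = ["user_id", "user_lat", "user_long", "timestamp", "event_name"]


def missing_columns(record, required):
    """Return the required columns that are absent or empty in a record."""
    return [col for col in required if record.get(col) in (None, "")]


# Hypothetical sample event; "city" is an optional extra attribute.
event = {
    "user_id": "u_1001",
    "user_lat": 12.9716,
    "user_long": 77.5946,
    "timestamp": "2021-03-05 14:30:00",
    "event_name": "search",
    "city": "Bengaluru",
}

print(missing_columns(event, USER_EVENT_COLUMNS))
```

The same helper can be reused for the other data sets by passing their own list of mandatory columns.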
Use Cases with User Events: Understanding user behavior such as installs, searches, bookings, user churn, revenue, cancellations, and conversion rate across geo, time, and categories.
Supply Events
Rider status, along with lat-long and timestamp, recorded at a fixed frequency. This powers analysis of the time riders spend in different locations.
Mandatory Columns in Rider Pings Data:
rider_id
rider_lat
rider_long
timestamp
rider_status
The rider can have other attributes associated with it, but they are not mandatory.
Use Cases with Rider Pings: Understanding rider behavior such as idle time, available time, active hours, incentives, and utilization across geo, time, and categories.
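Because pings arrive at a fixed frequency, time spent in each status can be approximated by counting pings and multiplying by the ping interval. The sketch below assumes a 10-second interval and the status values "idle" and "busy"; both are illustrative assumptions, not values defined by this document.

```python
from collections import Counter

# Assumption: one ping every 10 seconds. Use your own ping frequency.
PING_INTERVAL_SECONDS = 10


def seconds_per_status(pings):
    """Approximate the time spent in each rider_status from fixed-frequency pings."""
    counts = Counter(ping["rider_status"] for ping in pings)
    return {status: n * PING_INTERVAL_SECONDS for status, n in counts.items()}


# Hypothetical pings for a single rider.
pings = [
    {"rider_id": "r_1", "rider_lat": 12.97, "rider_long": 77.59,
     "timestamp": "2021-03-05 14:30:00", "rider_status": "idle"},
    {"rider_id": "r_1", "rider_lat": 12.97, "rider_long": 77.59,
     "timestamp": "2021-03-05 14:30:10", "rider_status": "idle"},
    {"rider_id": "r_1", "rider_lat": 12.98, "rider_long": 77.60,
     "timestamp": "2021-03-05 14:30:20", "rider_status": "busy"},
]

print(seconds_per_status(pings))
```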
Trips/Orders:
The start and end locations of a trip, along with other attributes such as revenue and trip type.
Mandatory Columns in Trips/Order Data:
trip_id
trip_start_lat
trip_start_long
trip_start_time
trip_end_time
trip_end_lat
trip_end_long
The trip can have other attributes associated with it, but they are not mandatory.
Use Cases with Trips: Understanding trip behavior such as trip starts, trip ends, and route characteristics. Examples include revenue by route, long-distance routes, routes with the most trips, origins of power users, and destinations that take the longest to reach.
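A metric such as "destinations that take the longest to reach" starts from the duration of each trip, which follows directly from the two mandatory time columns. A minimal sketch, assuming both columns use the yyyy-mm-dd hh:mm:ss format; the function name and sample trip are hypothetical.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumption: yyyy-mm-dd hh:mm:ss


def trip_duration_minutes(trip):
    """Return the trip duration in minutes from trip_start_time and trip_end_time."""
    start = datetime.strptime(trip["trip_start_time"], TIME_FORMAT)
    end = datetime.strptime(trip["trip_end_time"], TIME_FORMAT)
    return (end - start).total_seconds() / 60


# Hypothetical sample trip.
trip = {
    "trip_id": "t_42",
    "trip_start_lat": 12.9716, "trip_start_long": 77.5946,
    "trip_start_time": "2021-03-05 09:00:00",
    "trip_end_time": "2021-03-05 09:24:30",
    "trip_end_lat": 12.9352, "trip_end_long": 77.6245,
}

print(trip_duration_minutes(trip))
```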
Restaurants/Stations:
Location of a restaurant along with a unique ID that can be referenced.
Mandatory Columns in Restaurants/Stations Data:
store_id
The restaurant or station can have other attributes associated with it, but they are not mandatory.
Use Cases with Restaurants: Understanding trip behavior such as trip starts, trip ends, route characteristics
Note: The columns in the sample are indicative. Each column specified in the schema while uploading can be used as a property for aggregation or as a filter while computing a metric.
Data Constraints
Column/Field Names
You can have as many fields as you want in your data set, but we do put a constraint on the length of the field names. We recommend that field names be no longer than 32 characters. Names longer than 32 characters might cause errors while uploading the data.
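A quick pre-upload check for this limit could look like the sketch below. The function name and the sample field names are hypothetical.

```python
MAX_FIELD_NAME_LENGTH = 32  # recommended limit stated above


def overlong_field_names(field_names):
    """Return the field names that exceed the recommended length."""
    return [name for name in field_names if len(name) > MAX_FIELD_NAME_LENGTH]


# Hypothetical field names; the last one is 40 characters long.
fields = ["trip_id", "trip_start_lat", "estimated_time_of_arrival_at_destination"]

print(overlong_field_names(fields))
```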
Data Types
We support three basic data types.
String - All text type data can be marked as strings. Note that numeric data that you want to treat as text should be marked as string while configuring your data source.
Number - Covers all integer and floating-point data.
DateTime - DateTime fields are special fields that hold information about time and date. There are specific constraints related to this kind of data and they are mentioned in the section below.
Data format
The data should be in one of the following formats:
csv
json
parquet
Schema Constraint
The schema of the data should contain the following columns:
Primary key for the data
Update timestamp column
Lat column
Long column
DateTime Field Constraints
DateTime data is a special kind of string data that holds information about date and time. These strings follow specific formats, and we support a handful of them. To avoid errors while integrating, make sure your DateTime fields use one of the following formats.
yyyy/mm/dd hh:mm:ss.f
yyyy/mm/dd hh:mm:ss
yyyy-mm-dd hh:mm:ss.f
yyyy-mm-dd hh:mm:ss
yyyy/mm/dd
yyyy-mm-dd
epoch
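To check values locally against the formats listed above, they can be mapped to Python strptime patterns, as in the sketch below. Two points are assumptions rather than documented behavior: epoch values are treated as whole seconds since 1970-01-01 UTC, and the helper name parse_datetime is hypothetical.

```python
from datetime import datetime, timezone

# strptime equivalents of the string formats listed above.
SUPPORTED_FORMATS = [
    "%Y/%m/%d %H:%M:%S.%f",
    "%Y/%m/%d %H:%M:%S",
    "%Y-%m-%d %H:%M:%S.%f",
    "%Y-%m-%d %H:%M:%S",
    "%Y/%m/%d",
    "%Y-%m-%d",
]


def parse_datetime(value):
    """Parse a value in one of the supported formats into a naive datetime."""
    text = str(value).strip()
    # Assumption: an all-digit value is an epoch in seconds (UTC).
    if text.isdigit():
        return datetime.fromtimestamp(int(text), tz=timezone.utc).replace(tzinfo=None)
    for fmt in SUPPORTED_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError(f"Unsupported DateTime format: {value!r}")


print(parse_datetime("2021/03/05 14:30:00.250"))
print(parse_datetime("2021-03-05"))
```

Values that match none of the formats, such as 05-03-2021, raise a ValueError, so they can be caught before upload.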
Latitude and Longitude Fields
Location data is at the heart of locale.ai's functionality. Latitude and Longitude are two fields that together form a location context. Although there are multiple ways to represent latitude and longitude, we currently support only real-number values, with ranges of -90 to +90 for latitude and -180 to +180 for longitude.
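The range rule can be checked with a small helper like the one below. It is a sketch only; the function name is hypothetical, and it simply rejects anything that is not a real number within the stated ranges.

```python
def is_valid_location(lat, lon):
    """Return True if lat and lon are real numbers within the supported ranges."""
    try:
        lat_f, lon_f = float(lat), float(lon)
    except (TypeError, ValueError):
        # Non-numeric representations (for example degrees-minutes-seconds
        # strings) are not supported.
        return False
    # NaN fails both comparisons, so it is rejected as well.
    return -90.0 <= lat_f <= 90.0 and -180.0 <= lon_f <= 180.0


print(is_valid_location(12.9716, 77.5946))
print(is_valid_location(91, 0))
```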