Data Requirements

What Kind Of Data Do We Need?

User Events

Geo-tagged events that a user performs on a mobile app or website. They are used to understand area-level funnel conversions.

Mandatory Columns in User Events Data:

  • user_id

  • user_lat

  • user_long

  • timestamp

  • event_name

The event can have other attributes associated with it, but they are not mandatory.
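As a rough illustration of the mandatory-column rule, the sketch below checks a CSV header before upload. The helper name `missing_columns` and the sample row (including the extra `device` column) are made up for this example and are not part of the product.

```python
import csv
import io

# Mandatory columns for user events, as listed above.
USER_EVENT_COLUMNS = ["user_id", "user_lat", "user_long", "timestamp", "event_name"]


def missing_columns(header, required=USER_EVENT_COLUMNS):
    """Return the required column names that are absent from a header row."""
    present = set(header)
    return [name for name in required if name not in present]


# Hypothetical sample data; "device" is an optional extra attribute.
sample = io.StringIO(
    "user_id,user_lat,user_long,timestamp,event_name,device\n"
    "u1,12.9716,77.5946,2021-03-01 10:15:00,search,android\n"
)
reader = csv.DictReader(sample)
print(missing_columns(reader.fieldnames))  # prints []
```

The same helper can be reused for the other data sets by passing their mandatory column list as `required`.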

Use Cases with User Events: Understanding user behavior such as installs, searches, bookings, user churn, revenue, cancellations, and conversion rate across geo, time, and categories.

Supply Events

Rider status along with lat-long and timestamp at a fixed frequency. This powers analysis on time spent by riders in different locations.

Mandatory Columns in Rider Pings Data:

  • rider_id

  • rider_lat

  • rider_long

  • timestamp

  • rider_status

The rider can have other attributes associated with it, but they are not mandatory.

Use Cases with Rider Pings: Understanding rider behavior such as idle time, available time, active hours, incentives, and utilization across geo, time, and categories.
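Because pings arrive at a fixed frequency, time spent in a status can be approximated by counting pings and multiplying by the ping interval. The sketch below assumes a 30-second interval and the status values `idle` and `on_trip`. Both are hypothetical, so substitute your own.

```python
from collections import defaultdict

PING_INTERVAL_SECONDS = 30  # assumed ping frequency; replace with your own


def seconds_by_status(pings, interval=PING_INTERVAL_SECONDS):
    """Estimate seconds spent per (rider_id, rider_status) by counting pings."""
    totals = defaultdict(int)
    for ping in pings:
        totals[(ping["rider_id"], ping["rider_status"])] += interval
    return dict(totals)


# Hypothetical pings; values are made up for illustration.
pings = [
    {"rider_id": "r1", "rider_lat": 12.97, "rider_long": 77.59,
     "timestamp": "2021-03-01 10:00:00", "rider_status": "idle"},
    {"rider_id": "r1", "rider_lat": 12.97, "rider_long": 77.59,
     "timestamp": "2021-03-01 10:00:30", "rider_status": "idle"},
    {"rider_id": "r1", "rider_lat": 12.98, "rider_long": 77.60,
     "timestamp": "2021-03-01 10:01:00", "rider_status": "on_trip"},
]
print(seconds_by_status(pings))
```

This is only an approximation: dropped pings lead to undercounting, so a production pipeline would more likely use the gaps between consecutive timestamps.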

Trips/Orders

The start and end location of each trip, along with other attributes such as revenue and type of trip.

Mandatory Columns in Trips/Order Data:

  • trip_id

  • trip_start_lat

  • trip_start_long

  • trip_start_time

  • trip_end_time

  • trip_end_lat

  • trip_end_long

The trip can have other attributes associated with it, but they are not mandatory.

Use Cases with Trips: Understanding trip behavior such as trip starts, trip ends, and route characteristics. For example: revenue of routes, long-distance routes, routes with the most trips, origins of power users, and destinations that take the longest to reach.
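To show how the mandatory trip columns support questions like "which routes are long-distance" or "which destinations take longest to reach", here is a minimal sketch that derives a duration and a straight-line (haversine) distance per trip. The function names and the sample trip are hypothetical, and the haversine distance is a stand-in: it is not the routed distance, and nothing here describes how the product computes these metrics.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/long points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def trip_summary(trip, fmt="%Y-%m-%d %H:%M:%S"):
    """Derive duration (seconds) and straight-line distance (km) for one trip."""
    start = datetime.strptime(trip["trip_start_time"], fmt)
    end = datetime.strptime(trip["trip_end_time"], fmt)
    return {
        "trip_id": trip["trip_id"],
        "duration_s": (end - start).total_seconds(),
        "distance_km": haversine_km(
            trip["trip_start_lat"], trip["trip_start_long"],
            trip["trip_end_lat"], trip["trip_end_long"],
        ),
    }


# Hypothetical trip; values are made up for illustration.
trip = {
    "trip_id": "t1",
    "trip_start_lat": 12.9716, "trip_start_long": 77.5946,
    "trip_start_time": "2021-03-01 10:00:00",
    "trip_end_time": "2021-03-01 10:25:00",
    "trip_end_lat": 12.9352, "trip_end_long": 77.6245,
}
print(trip_summary(trip))
```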

Restaurants/Stations

Location of a restaurant along with a unique ID that can be referenced.

Mandatory Columns in Restaurants/Stations Data:

  • store_id

The restaurant can have other attributes associated with it, but they are not mandatory.

Use Cases with Restaurants: Understanding trip behavior such as trip starts, trip ends, and route characteristics.

Note: The columns in the sample are indicative. Each column specified in the schema while uploading can be used as a property for aggregation or as a filter while computing a metric.

Data Constraints

Column/Field Names

You can have as many fields as you want in your data set, but we do put a constraint on the length of field names. We recommend that field names be no longer than 32 characters. Names longer than 32 characters might cause errors while uploading the data.
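A quick pre-upload check for the 32-character recommendation could look like the sketch below. The helper name `overlong_field_names` and the example field names are hypothetical.

```python
MAX_FIELD_NAME_LENGTH = 32  # recommended limit stated above


def overlong_field_names(field_names, limit=MAX_FIELD_NAME_LENGTH):
    """Return the field names that exceed the recommended length."""
    return [name for name in field_names if len(name) > limit]


# Hypothetical field names for illustration.
fields = ["user_id", "timestamp", "estimated_time_of_arrival_in_minutes"]
for name in overlong_field_names(fields):
    print(f"{name!r} has {len(name)} characters; consider shortening it")
```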

Data Types

We support three basic data types.

  1. String - All text type data can be marked as strings. Note that numeric data that you want to treat as text should be marked as string while configuring your data source.

  2. Number - Number encompasses all the integer and floating point type data.

  3. DateTime - DateTime fields are special fields that hold information about time and date. There are specific constraints related to this kind of data and they are mentioned in the section below.

Data Format

The data should be in one of the following formats:

  • csv

  • json

  • parquet

Schema Constraint

The schema of the data should contain the following columns:

  • Primary key for the data

  • Update timestamp for the record

  • Lat column

  • Long column

DateTime Field Constraints

DateTime data is a special kind of string data that holds information about date and time. These strings follow specific formats, and we support a handful of them. To avoid errors while integrating, make sure your DateTime fields use one of the following formats:

  • yyyy/mm/dd hh:mm:ss.f

  • yyyy/mm/dd hh:mm:ss

  • yyyy-mm-dd hh:mm:ss.f

  • yyyy-mm-dd hh:mm:ss

  • yyyy/mm/dd

  • yyyy-mm-dd

  • epoch
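The formats listed above can be checked locally before upload. The sketch below maps them to Python `strptime` patterns under a few assumptions that the list does not spell out: `.f` is read as fractional seconds, `epoch` is read as seconds since 1970-01-01 UTC (not milliseconds), and `parse_datetime` is a hypothetical helper name.

```python
from datetime import datetime, timezone

# strptime equivalents of the listed formats (".f" assumed to mean fractional seconds).
SUPPORTED_FORMATS = [
    "%Y/%m/%d %H:%M:%S.%f",
    "%Y/%m/%d %H:%M:%S",
    "%Y-%m-%d %H:%M:%S.%f",
    "%Y-%m-%d %H:%M:%S",
    "%Y/%m/%d",
    "%Y-%m-%d",
]


def _from_epoch(seconds):
    # Assumption: epoch values are seconds since 1970-01-01 UTC.
    return datetime.fromtimestamp(seconds, tz=timezone.utc).replace(tzinfo=None)


def parse_datetime(value):
    """Parse a value in one of the listed formats, or raise ValueError."""
    if isinstance(value, (int, float)):
        return _from_epoch(value)
    text = str(value).strip()
    for fmt in SUPPORTED_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    if text.isdigit():
        return _from_epoch(int(text))
    raise ValueError(f"unsupported DateTime value: {value!r}")


print(parse_datetime("2021-03-01 10:15:00"))
print(parse_datetime("2021/03/01"))
```

A value such as `01-03-2021` raises `ValueError` here, which is a cheap way to catch day-first dates before they reach the upload step.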

Latitude and Longitude Fields

Location data is at the heart of locale.ai's functionality. Latitude and longitude are two fields that together form a location context. Although there are multiple ways to represent latitude and longitude, we currently support only real-number values, with ranges of -90 to +90 for latitude and -180 to +180 for longitude.
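The range rule can be expressed as a small check. This is a minimal sketch with a hypothetical helper name, `is_valid_location`. It accepts numbers or numeric strings and rejects anything else, such as degree-minute-second notation.

```python
def is_valid_location(lat, lon):
    """True if lat/lon are real numbers within -90..90 and -180..180."""
    try:
        lat, lon = float(lat), float(lon)
    except (TypeError, ValueError):
        return False
    # NaN fails both comparisons, so it is rejected as well.
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0


print(is_valid_location(12.9716, 77.5946))   # prints True
print(is_valid_location(77.5946, 212.9716))  # prints False (longitude out of range)
```

A common cause of failures is swapped columns: a longitude stored in the latitude field fails whenever its magnitude exceeds 90.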
