Features API
The Features API provides features for ML Models across all types of demand causal factors, including attended events and non-attended events.
It allows you to go straight to feature-importance testing and improving your models rather than having to worry about first building a data lake and then aggregating the data.
The Features API is new, and its features will be expanded and added to over time. The endpoint currently supports:
Basic Concepts
Every request to the API must specify which features are needed. Each request must specify a date range and location (that applies to all features in that request). Certain groups of features support additional filtering/parameters. Results are at the daily level.
Response Format
You can specify the format of the response by passing the appropriate value in the Accept header. Valid values are application/json and text/csv.
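As a minimal sketch of how the Accept header selects the response format, the Python snippet below prepares (but does not send) a request using only the standard library. The helper name build_request and the explicit Content-Type header are assumptions made for illustration; they are not taken from the API documentation.

```python
import json
import urllib.request

FEATURES_URL = "https://api.predicthq.com/v1/features"
ACCEPT_TYPES = {"json": "application/json", "csv": "text/csv"}


def build_request(access_token, payload, response_format="json"):
    """Prepare (but do not send) a POST request for the Features API."""
    if response_format not in ACCEPT_TYPES:
        raise ValueError(f"unsupported response format: {response_format!r}")
    headers = {
        "Accept": ACCEPT_TYPES[response_format],
        "Authorization": f"Bearer {access_token}",
        # Assumption: the JSON request body is declared explicitly.
        "Content-Type": "application/json",
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(FEATURES_URL, data=body, headers=headers, method="POST")


# Ask for CSV instead of the JSON used elsewhere on this page.
request = build_request("YOUR_ACCESS_TOKEN", {"phq_attendance_sports": True}, "csv")
print(request.get_header("Accept"))
```

Passing the prepared object to urllib.request.urlopen would send it; that step is left out here so the sketch runs without credentials.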
Range filter constraint
Each request can currently fetch up to 90 days of data. For longer date ranges, multiple requests must be made; we have some examples of how to do that in this notebook. There is no pagination in this API.
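One way to cover a longer period is to split it into consecutive windows and issue one request per window. The sketch below is an illustration only: split_active_range is a hypothetical helper, and it assumes the 90-day limit counts both the start and the end day. If the API measures the range differently, a smaller max_days value would be the safer choice.

```python
from datetime import date, timedelta

MAX_DAYS = 90  # per-request limit stated above


def split_active_range(start, end, max_days=MAX_DAYS):
    """Yield (gte, lte) ISO date pairs, each covering at most max_days days."""
    if end < start:
        raise ValueError("end must not be before start")
    chunk_start = start
    while chunk_start <= end:
        # Both endpoints are included, so a window spans max_days - 1 day steps.
        chunk_end = min(chunk_start + timedelta(days=max_days - 1), end)
        yield chunk_start.isoformat(), chunk_end.isoformat()
        chunk_start = chunk_end + timedelta(days=1)


chunks = list(split_active_range(date(2019, 1, 1), date(2019, 12, 31)))
for gte, lte in chunks:
    print({"active": {"gte": gte, "lte": lte}})
```

Each pair can be dropped into the active field of a separate request, and the daily results concatenated afterwards.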
Common base and constraints
At minimum the following needs to be provided in the POST request:
Active start date
Active end date
location - Geo-location and radius or list of placeId
These will apply globally across all requested features as the base filter criteria.
Range filter constraint
The calculated active date range may not exceed 90 days.
Constraint on list of PlaceIDs
If more than one Place ID is specified, the response details the aggregated stats across all of the specified Place IDs (up to 3).
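The base criteria can be assembled before any features are added. The following Python sketch shows one possible shape for that step; build_base_payload is a hypothetical helper name, and the field names (active, location, place_id, geo) follow the request examples on this page.

```python
def build_base_payload(gte, lte, place_ids=None, geo=None):
    """Build the date range and location shared by all requested features."""
    if (place_ids is None) == (geo is None):
        raise ValueError("specify exactly one of place_ids or geo")
    if place_ids is not None:
        if not 1 <= len(place_ids) <= 3:
            raise ValueError("between 1 and 3 Place IDs are allowed")
        location = {"place_id": list(place_ids)}
    else:
        location = {"geo": dict(geo)}
    return {"active": {"gte": gte, "lte": lte}, "location": location}


payload = build_base_payload(
    "2019-11-16", "2019-11-30", place_ids=[5224323, 5811704, 4887398]
)
# Category features are then added on top of the base criteria.
payload["phq_attendance_sports"] = True
```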
Category Features
Each request to the Features API has to specify at least one category feature. These features are currently based on attendance, viewership, rank or impact:
see PHQ Attendance
see PHQ Viewership
see PHQ Rank
see PHQ Impact
Response
The Features API returns a time-series response containing an entry for each day in the requested active date range. The response structure for PHQ Attendance and PHQ Viewership differs from that of PHQ Rank: the former return the requested statistics calculated from PHQ Attendance/PHQ Viewership, while the latter returns a rank-range histogram of event counts.
PHQ Attendance response
Example response entry:
"phq_attendance_concerts": {
"stats": {
"count": 16,
"sum": 4972
}
}
The above is returned for each attendance category feature requested. The stats will contain the calculated values for all statistical measures requested in the criteria.
PHQ Viewership response
Example response entry:
"phq_viewership_sports_american_football_nfl": {
"stats": {
"count": 21,
"sum": 5643
}
}
The above is returned for each viewership category feature requested. The stats will contain the calculated values for all statistical measures requested in the criteria.
PHQ Rank response
Example response entry:
"phq_rank_public_holidays": {
"rank_levels": {
"1": 0,
"2": 0,
"3": 0,
"4": 2,
"5": 0
}
}
The above is returned for each rank category feature requested. The rank_levels will contain the calculated histogram of event counts based on rank range, i.e.
0 <= phq-rank <= 20 → Rank-Level: 1
20 < phq-rank <= 40 → Rank-Level: 2
40 < phq-rank <= 60 → Rank-Level: 3
60 < phq-rank <= 80 → Rank-Level: 4
80 < phq-rank <= 100 → Rank-Level: 5
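To make the bucketing concrete, here is a small Python sketch that maps a rank to its level and builds a histogram in the same shape as rank_levels. The function names are hypothetical, and the sketch assumes that a boundary value (20, 40, 60 or 80) belongs to the lower level.

```python
# (upper bound of the rank range, rank level)
LEVEL_UPPER_BOUNDS = ((20, 1), (40, 2), (60, 3), (80, 4), (100, 5))


def rank_level(phq_rank):
    """Return the rank level (1-5) for a PHQ rank between 0 and 100."""
    if not 0 <= phq_rank <= 100:
        raise ValueError("phq_rank must be between 0 and 100")
    for upper_bound, level in LEVEL_UPPER_BOUNDS:
        if phq_rank <= upper_bound:
            return level


def rank_histogram(ranks):
    """Count events per rank level, keyed "1" to "5" like rank_levels."""
    histogram = {str(level): 0 for level in range(1, 6)}
    for rank in ranks:
        histogram[str(rank_level(rank))] += 1
    return histogram


histogram = rank_histogram([65, 80])
print(histogram)
```

Two events with ranks in the 60 to 80 range both fall into level 4, which is the shape of the example entry above.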
Request Features
Attendance based Feature Fields
These fields aggregate the PHQ attendance value for events based on the specified attendance-based category. An explicit feature key is defined for each phq_attendance_{attendance based category} combination, i.e.
community → phq_attendance_community
concerts → phq_attendance_concerts
conferences → phq_attendance_conferences
expos → phq_attendance_expos
festivals → phq_attendance_festivals
performing-arts → phq_attendance_performing_arts
sports → phq_attendance_sports
academic → see PHQ Academic Attendance
Note on specifying features
If a category feature is not explicitly requested it WILL NOT be queried or evaluated
Field | Description |
---|---|
phq_attendance_community bool or Attendance Criteria optional excluded by default | Attendance community feature. Example 1: "phq_attendance_community": true Example 2: "phq_attendance_community": {"stats":..., "phq_rank": {...}} |
phq_attendance_concerts bool or Attendance Criteria optional excluded by default | Attendance concerts feature. Example 1: "phq_attendance_concerts": true Example 2: "phq_attendance_concerts": {"stats":..., "phq_rank": {...}} |
phq_attendance_conferences bool or Attendance Criteria optional excluded by default | Attendance conferences feature. Example 1: "phq_attendance_conferences": true Example 2: "phq_attendance_conferences": {"stats":..., "phq_rank": {...}} |
phq_attendance_expos bool or Attendance Criteria optional excluded by default | Attendance expos feature. Example 1: "phq_attendance_expos": true Example 2: "phq_attendance_expos": {"stats":..., "phq_rank": {...}} |
phq_attendance_festivals bool or Attendance Criteria optional excluded by default | Attendance festivals feature. Example 1: "phq_attendance_festivals": true Example 2: "phq_attendance_festivals": {"stats":..., "phq_rank": {...}} |
phq_attendance_performing_arts bool or Attendance Criteria optional excluded by default | Attendance performing-arts feature. Example 1: "phq_attendance_performing_arts": true Example 2: "phq_attendance_performing_arts": {"stats":..., "phq_rank": {...}} |
phq_attendance_school_holidays bool or Attendance Criteria optional excluded by default | Attendance school-holidays feature. Example 1: "phq_attendance_school_holidays": true Example 2: "phq_attendance_school_holidays": {"stats":..., "phq_rank": {...}} |
phq_attendance_sports bool or Attendance Criteria optional excluded by default | Attendance sports feature. Example 1: "phq_attendance_sports": true Example 2: "phq_attendance_sports": {"stats":..., "phq_rank": {...}} |
Each feature category can have its own Attendance Criteria; only the active date range and location are defined globally.
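As a rough illustration of the key naming and of per-feature criteria, the Python sketch below turns category names into feature keys and attaches either true or a criteria object to each. feature_key and attendance_features are hypothetical helper names, not part of any client library, and the sketch does not check that a category actually exists.

```python
def feature_key(prefix, category):
    """Join a prefix and a category name, replacing hyphens with underscores."""
    return f"{prefix}_{category.replace('-', '_')}"


def attendance_features(requested):
    """Map {category: True or criteria dict} to {feature key: same value}."""
    return {
        feature_key("phq_attendance", category): criteria
        for category, criteria in requested.items()
    }


features = attendance_features({
    "performing-arts": True,                    # default stats
    "conferences": {"stats": ["min", "max"]},   # feature-specific criteria
})
print(features)
```

The resulting dictionary can be merged into a request body next to the active and location fields.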
Action
POST /v1/features/
Example 1
curl -X POST https://api.predicthq.com/v1/features \
-H "Accept: application/json" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $ACCESS_TOKEN" \
--data @<(cat <<EOF
{
"active": {
"gte": "2019-11-16",
"lte": "2019-11-30"
},
"location": {
"place_id": [
5224323,
5811704,
4887398
]
},
"phq_attendance_conferences": {
"stats": [
"min",
"max"
]
},
"phq_attendance_sports": true,
"phq_attendance_concerts": true,
"phq_rank_public_holidays": true
}
EOF
)
Example response (showing two of the requested days):
{
"results": [
{
"date": "2019-11-28",
"phq_attendance_concerts": {
"stats": {
"count": 4,
"sum": 1101
}
},
"phq_attendance_conferences": {
"stats": {
"count": 1,
"min": 11,
"max": 11
}
},
"phq_attendance_sports": {
"stats": {
"count": 0,
"sum": 0
}
},
"phq_rank_public_holidays": {
"rank_levels": {
"1": 0,
"2": 0,
"3": 0,
"4": 0,
"5": 1
}
}
},
{
"date": "2019-11-29",
"phq_attendance_concerts": {
"stats": {
"count": 51,
"sum": 43619
}
},
"phq_attendance_conferences": {
"stats": {
"count": 0,
"min": 0,
"max": 0
}
},
"phq_attendance_sports": {
"stats": {
"count": 3,
"sum": 46954
}
},
"phq_rank_public_holidays": {
"rank_levels": {
"1": 0,
"2": 0,
"3": 0,
"4": 2,
"5": 0
}
}
}
]
}
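For model training it is often convenient to turn the nested daily entries into flat rows. The Python sketch below shows one way to do that; flatten_results and the column naming scheme are assumptions made for illustration, and the sample is a shortened version of the response above.

```python
def flatten_results(response):
    """Return one flat dict per day from a Features API JSON response."""
    rows = []
    for day in response["results"]:
        row = {"date": day["date"]}
        for feature, value in day.items():
            if feature == "date":
                continue
            # Attendance and viewership features carry "stats".
            for stat, number in value.get("stats", {}).items():
                row[f"{feature}_{stat}"] = number
            # Rank features carry a "rank_levels" histogram keyed "1" to "5".
            for level, count in value.get("rank_levels", {}).items():
                row[f"{feature}_level_{level}"] = count
        rows.append(row)
    return rows


sample = {"results": [{
    "date": "2019-11-29",
    "phq_attendance_sports": {"stats": {"count": 3, "sum": 46954}},
    "phq_rank_public_holidays": {
        "rank_levels": {"1": 0, "2": 0, "3": 0, "4": 2, "5": 0}
    },
}]}
rows = flatten_results(sample)
print(rows[0]["phq_attendance_sports_sum"])
```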
Viewership based Feature Fields
These fields aggregate the PHQ viewership value for events based on the specified viewership-based category. An explicit feature key is defined for each phq_viewership_{type}_{viewership based type category} combination. Currently the only type covered is sports. Available fields:
All sports → phq_viewership_sports
All American Football → phq_viewership_sports_american_football
American Football - NCAA Mens → phq_viewership_sports_american_football_ncaa_men
American Football - NFL → phq_viewership_sports_american_football_nfl
All Automotive Racing → phq_viewership_sports_auto_racing
Automotive Racing - Indy Car → phq_viewership_sports_auto_racing_indy_car
Automotive Racing - NASCAR → phq_viewership_sports_auto_racing_nascar
All Baseball → phq_viewership_sports_baseball
Baseball - MLB → phq_viewership_sports_baseball_mlb
Baseball - NCAA Mens → phq_viewership_sports_baseball_ncaa_men
All Basketball → phq_viewership_sports_basketball
Basketball - NBA → phq_viewership_sports_basketball_nba
Basketball - NCAA Mens → phq_viewership_sports_basketball_ncaa_men
Basketball - NCAA Womens → phq_viewership_sports_basketball_ncaa_women
All Boxing → phq_viewership_sports_boxing
All Golf → phq_viewership_sports_golf
Golf - Masters → phq_viewership_sports_golf_masters
Golf - PGA Championships → phq_viewership_sports_golf_pga_championship
Golf - PGA Tours → phq_viewership_sports_golf_pga_tour
Golf - US Open → phq_viewership_sports_golf_us_open
All Horse Racing → phq_viewership_sports_horse_racing
Horse Racing - Belmont Stakes → phq_viewership_sports_horse_racing_belmont_stakes
Horse Racing - Kentucky Derby → phq_viewership_sports_horse_racing_kentucky_derby
Horse Racing - Preakness Stakes → phq_viewership_sports_horse_racing_preakness_stakes
All Ice Hockey → phq_viewership_sports_ice_hockey
Ice Hockey - NHL → phq_viewership_sports_ice_hockey_nhl
All Mixed Martial Arts → phq_viewership_sports_mma
Mixed Martial Arts - UFC → phq_viewership_sports_mma_ufc
All Soccer → phq_viewership_sports_soccer
Soccer - CONCACAF Champions League → phq_viewership_sports_soccer_concacaf_champions_league
Soccer - CONCACAF Gold Cup → phq_viewership_sports_soccer_concacaf_gold_cup
Soccer - COPA America Mens → phq_viewership_sports_soccer_copa_america_men
Soccer - FIFA World Cup Womens → phq_viewership_sports_soccer_fifa_world_cup_women
Soccer - FIFA World Cup Mens → phq_viewership_sports_soccer_fifa_world_cup_men
Soccer - MLS → phq_viewership_sports_soccer_mls
Soccer - UEFA Champions League Mens → phq_viewership_sports_soccer_uefa_champions_league_men
All Softball → phq_viewership_sports_softball
Softball - NCAA Womens → phq_viewership_sports_softball_ncaa_women
All Tennis → phq_viewership_sports_tennis
Tennis - US Open → phq_viewership_sports_tennis_us_open
Tennis - Wimbledon → phq_viewership_sports_tennis_wimbledon
Note on specifying features
If a category feature is not explicitly requested it WILL NOT be queried or evaluated
Example 2
curl -X POST https://api.predicthq.com/v1/features \
-H "Accept: application/json" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $ACCESS_TOKEN" \
--data @<(cat <<EOF
{
"active": {
"gte": "2019-11-16",
"lte": "2019-11-17"
},
"location": {
"geo": {
"lon": -71.49978,
"lat": 41.75038,
"radius": "30mi"
}
},
"phq_attendance_conferences": {
"stats": [
"min",
"max"
]
},
"phq_attendance_sports": {
"stats": ["count", "std_dev", "median"],
"phq_rank": {
"gt": 50
}
},
"phq_attendance_concerts": true,
"phq_rank_public_holidays": true
}
EOF
)
Example response:
{
"results": [
{
"date": "2019-11-16",
"phq_attendance_concerts": {
"stats": {
"count": 16,
"sum": 4972
}
},
"phq_attendance_conferences": {
"stats": {
"count": 1,
"min": 1500,
"max": 1500
}
},
"phq_attendance_sports": {
"stats": {
"count": 4,
"median": 2859.5,
"std_dev": 3189.0008231419447
}
},
"phq_rank_public_holidays": {
"rank_levels": {
"1": 0,
"2": 0,
"3": 0,
"4": 0,
"5": 0
}
}
},
{
"date": "2019-11-17",
"phq_attendance_concerts": {
"stats": {
"count": 1,
"sum": 200
}
},
"phq_attendance_conferences": {
"stats": {
"count": 0,
"min": 0,
"max": 0
}
},
"phq_attendance_sports": {
"stats": {
"count": 0,
"median": 0.0,
"std_dev": null
}
},
"phq_rank_public_holidays": {
"rank_levels": {
"1": 0,
"2": 0,
"3": 0,
"4": 0,
"5": 0
}
}
}
]
}
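In the response above, std_dev is null on a day without matching events, so code that consumes a single statistic needs to decide how to treat missing values. The Python sketch below extracts one statistic as a date-keyed series; stat_series is a hypothetical helper, and substituting a default for null is a modelling choice rather than API behaviour.

```python
def stat_series(response, feature, stat, default=0.0):
    """Return {date: value} for one statistic, replacing null with default."""
    series = {}
    for day in response["results"]:
        value = day[feature]["stats"].get(stat)
        series[day["date"]] = default if value is None else value
    return series


sample = {"results": [
    {"date": "2019-11-16",
     "phq_attendance_sports": {"stats": {"count": 4, "median": 2859.5,
                                         "std_dev": 3189.0008231419447}}},
    {"date": "2019-11-17",
     "phq_attendance_sports": {"stats": {"count": 0, "median": 0.0,
                                         "std_dev": None}}},
]}
series = stat_series(sample, "phq_attendance_sports", "std_dev")
print(series)
```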
Rank based Feature Fields
These fields aggregate events based on the specified scheduled, non-attendance-based category into different buckets of PHQ rank levels. An explicit feature key is defined for each phq_rank_{rank based category} combination. Available fields:
daylight-savings → phq_rank_daylight_savings
health-warnings → phq_rank_health_warnings
observances → phq_rank_observances
public-holidays → phq_rank_public_holidays
school-holidays → phq_rank_school_holidays
politics → phq_rank_politics
academic → see PHQ Academic Rank
Note on specifying features
Rank features have no specific filter criteria at this time. They are activated only by setting the field value to true.
Field | Description |
---|---|
phq_rank_daylight_savings bool optional excluded by default | Rank daylight-savings feature. To activate, set to true. E.g. "phq_rank_daylight_savings": true |
phq_rank_health_warnings bool optional excluded by default | Rank health-warnings feature. To activate, set to true. E.g. "phq_rank_health_warnings": true |
phq_rank_observances bool optional excluded by default | Rank observances feature. To activate, set to true. E.g. "phq_rank_observances": true |
phq_rank_public_holidays bool optional excluded by default | Rank public-holidays feature. To activate, set to true. E.g. "phq_rank_public_holidays": true |
phq_rank_school_holidays bool optional excluded by default | Rank school-holidays feature. To activate, set to true. E.g. "phq_rank_school_holidays": true |
phq_rank_politics bool optional excluded by default | Rank politics feature. To activate, set to true. E.g. "phq_rank_politics": true |
Impact based Feature Fields
These fields aggregate events based on their impact on a given industry vertical for a specified category and subcategory. Currently the only category supported is severe_weather and the only vertical is retail. Feature format: phq_impact_{category}_{subcategory}_{vertical}.
Field | Description |
---|---|
phq_impact_severe_weather_air_quality_retail bool or Impact Criteria optional excluded by default | Impact air_quality feature. Example 1: "phq_impact_severe_weather_air_quality_retail": true Example 2: "phq_impact_severe_weather_air_quality_retail": {"stats": ["count", "max"]} |
phq_impact_severe_weather_blizzard_retail bool or Impact Criteria optional excluded by default | Impact blizzard feature. Example 1: "phq_impact_severe_weather_blizzard_retail": true Example 2: "phq_impact_severe_weather_blizzard_retail": {"stats": ["count", "max"]} |
phq_impact_severe_weather_cold_wave_retail bool or Impact Criteria optional excluded by default | Impact cold_wave feature. Example 1: "phq_impact_severe_weather_cold_wave_retail": true Example 2: "phq_impact_severe_weather_cold_wave_retail": {"stats": ["count", "max"]} |
phq_impact_severe_weather_cold_wave_snow_retail bool or Impact Criteria optional excluded by default | Impact cold_wave_snow feature. Example 1: "phq_impact_severe_weather_cold_wave_snow_retail": true Example 2: "phq_impact_severe_weather_cold_wave_snow_retail": {"stats": ["count", "max"]} |
phq_impact_severe_weather_cold_wave_storm_retail bool or Impact Criteria optional excluded by default | Impact cold_wave_storm feature. Example 1: "phq_impact_severe_weather_cold_wave_storm_retail": true Example 2: "phq_impact_severe_weather_cold_wave_storm_retail": {"stats": ["count", "max"]} |
phq_impact_severe_weather_dust_retail bool or Impact Criteria optional excluded by default | Impact dust feature. Example 1: "phq_impact_severe_weather_dust_retail": true Example 2: "phq_impact_severe_weather_dust_retail": {"stats": ["count", "max"]} |
phq_impact_severe_weather_dust_storm_retail bool or Impact Criteria optional excluded by default | Impact dust_storm feature. Example 1: "phq_impact_severe_weather_dust_storm_retail": true Example 2: "phq_impact_severe_weather_dust_storm_retail": {"stats": ["count", "max"]} |
phq_impact_severe_weather_flood_retail bool or Impact Criteria optional excluded by default | Impact flood feature. Example 1: "phq_impact_severe_weather_flood_retail": true Example 2: "phq_impact_severe_weather_flood_retail": {"stats": ["count", "max"]} |
phq_impact_severe_weather_heat_wave_retail bool or Impact Criteria optional excluded by default | Impact heat_wave feature. Example 1: "phq_impact_severe_weather_heat_wave_retail": true Example 2: "phq_impact_severe_weather_heat_wave_retail": {"stats": ["count", "max"]} |
phq_impact_severe_weather_hurricane_retail bool or Impact Criteria optional excluded by default | Impact hurricane feature. Example 1: "phq_impact_severe_weather_hurricane_retail": true Example 2: "phq_impact_severe_weather_hurricane_retail": {"stats": ["count", "max"]} |
phq_impact_severe_weather_thunderstorm_retail bool or Impact Criteria optional excluded by default | Impact thunderstorm feature. Example 1: "phq_impact_severe_weather_thunderstorm_retail": true Example 2: "phq_impact_severe_weather_thunderstorm_retail": {"stats": ["count", "max"]} |
phq_impact_severe_weather_tornado_retail bool or Impact Criteria optional excluded by default | Impact tornado feature. Example 1: "phq_impact_severe_weather_tornado_retail": true Example 2: "phq_impact_severe_weather_tornado_retail": {"stats": ["count", "max"]} |
phq_impact_severe_weather_tropical_storm_retail bool or Impact Criteria optional excluded by default | Impact tropical_storm feature. Example 1: "phq_impact_severe_weather_tropical_storm_retail": true Example 2: "phq_impact_severe_weather_tropical_storm_retail": {"stats": ["count", "max"]} |
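Because every impact key follows the same pattern, the keys can be generated rather than typed out. The Python sketch below does this for the subcategories listed in the table; impact_key and impact_features are hypothetical helper names used only for illustration.

```python
SEVERE_WEATHER_SUBCATEGORIES = (
    "air_quality", "blizzard", "cold_wave", "cold_wave_snow",
    "cold_wave_storm", "dust", "dust_storm", "flood", "heat_wave",
    "hurricane", "thunderstorm", "tornado", "tropical_storm",
)


def impact_key(subcategory, category="severe_weather", vertical="retail"):
    """Build a key of the form phq_impact_{category}_{subcategory}_{vertical}."""
    return f"phq_impact_{category}_{subcategory}_{vertical}"


def impact_features(subcategories, stats=None):
    """Request each subcategory with default stats (true) or explicit stats."""
    return {
        impact_key(sub): {"stats": list(stats)} if stats else True
        for sub in subcategories
    }


all_impact = impact_features(SEVERE_WEATHER_SUBCATEGORIES)
flood_only = impact_features(["flood"], stats=["count", "max"])
print(flood_only)
```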
PHQ Academic Feature Fields
An explicit feature key is defined for each {feature_type}_academic_{type} combination. There are 5 different academic {types}, namely:
academic-session
exam
holiday
graduation
social
PHQ Academic Attendance
Aggregates the PHQ attendance value for academic events based on the explicit type specified. Only graduation and social events are considered attendance-based events. This results in the following academic feature keys conforming to the structure phq_attendance_academic_{type}:
graduation → phq_attendance_academic_graduation
social → phq_attendance_academic_social
Field | Description |
---|---|
phq_attendance_academic_graduation bool or Attendance Criteria optional excluded by default | Attendance academic graduation feature. Example 1: "phq_attendance_academic_graduation": true Example 2: "phq_attendance_academic_graduation": {...} |
phq_attendance_academic_social bool or Attendance Criteria optional excluded by default | Attendance academic social feature. Example 1: "phq_attendance_academic_social": true Example 2: "phq_attendance_academic_social": {...} |
The filter criteria remain the same as for phq_attendance_{category} features.
PHQ Academic Rank
Aggregates the PHQ rank value for academic events based on the explicit type specified. Only academic-session, exam, and holiday are considered rank-based events. This results in the following academic feature keys conforming to phq_rank_academic_{type}:
academic-session → phq_rank_academic_session
exam → phq_rank_academic_exam
holiday → phq_rank_academic_holiday
Field | Description |
---|---|
phq_rank_academic_session bool optional excluded by default | Rank academic session feature. To activate, set to true. E.g. "phq_rank_academic_session": true |
phq_rank_academic_exam bool optional excluded by default | Rank academic exam feature. To activate, set to true. E.g. "phq_rank_academic_exam": true |
phq_rank_academic_holiday bool optional excluded by default | Rank academic holiday feature. To activate, set to true. E.g. "phq_rank_academic_holiday": true |
The filter criteria remain the same as for phq_rank_{category} features i.e. there are none.
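The split between attendance-based and rank-based academic types can be captured in a simple lookup. The Python sketch below is illustrative only; ACADEMIC_FEATURE_KEYS and academic_feature are hypothetical names. Note that academic-session maps to phq_rank_academic_session, so the key cannot be derived by plain concatenation.

```python
# Academic type -> feature key, as listed in the two sections above.
ACADEMIC_FEATURE_KEYS = {
    "graduation": "phq_attendance_academic_graduation",
    "social": "phq_attendance_academic_social",
    "academic-session": "phq_rank_academic_session",
    "exam": "phq_rank_academic_exam",
    "holiday": "phq_rank_academic_holiday",
}


def academic_feature(academic_type, criteria=None):
    """Return a one-entry dict requesting the feature for an academic type."""
    key = ACADEMIC_FEATURE_KEYS[academic_type]
    if key.startswith("phq_rank_"):
        if criteria is not None:
            raise ValueError("rank features accept no filter criteria")
        return {key: True}
    # Attendance features accept true or an Attendance Criteria object.
    return {key: True if criteria is None else dict(criteria)}


request_features = {}
request_features.update(academic_feature("graduation", {"stats": ["sum"]}))
request_features.update(academic_feature("academic-session"))
print(request_features)
```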
Base fields
Active Date Range Filter Fields
Supports gt, gte, lt, and lte constraints.
Validations:
Requires both {gt|gte} and {lt|lte} to be specified.
{gt|gte} value must be less than {lt|lte} value.
Date range must be less than or equal to 90 days.
Minimum date is 2016-01-01
Maximum date is {current year + 1}-12-31
Field | Description |
---|---|
gt string optional | Filter for Active local date > Date - ISO 8601 format. E.g. 2014-07-16 |
gte string optional | Filter for Active local date >= Date - ISO 8601 format. E.g. 2014-07-16 |
lt string optional | Filter for Active local date < Date - ISO 8601 format. E.g. 2014-07-16 |
lte string optional | Filter for Active local date <= Date - ISO 8601 format. E.g. 2014-07-16 |
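The validations above can be checked on the client before a request is sent. The Python sketch below is one possible reading of them: validate_active is a hypothetical helper, and it assumes the 90-day limit and the minimum and maximum dates apply to the days actually covered once exclusive bounds (gt, lt) are taken into account.

```python
from datetime import date, timedelta

MIN_DATE = date(2016, 1, 1)
MAX_RANGE_DAYS = 90


def validate_active(active, today=None):
    """Check an "active" filter and return the (first, last) days it covers."""
    today = today or date.today()
    max_date = date(today.year + 1, 12, 31)
    lower_keys = [key for key in ("gt", "gte") if key in active]
    upper_keys = [key for key in ("lt", "lte") if key in active]
    if len(lower_keys) != 1 or len(upper_keys) != 1:
        raise ValueError("specify one of gt/gte and one of lt/lte")
    lower = date.fromisoformat(active[lower_keys[0]])
    upper = date.fromisoformat(active[upper_keys[0]])
    if lower >= upper:
        raise ValueError("lower bound must be less than upper bound")
    # Exclusive bounds (gt/lt) shift the covered days inward by one.
    first = lower + timedelta(days=1) if lower_keys[0] == "gt" else lower
    last = upper - timedelta(days=1) if upper_keys[0] == "lt" else upper
    if first < MIN_DATE or last > max_date:
        raise ValueError(f"dates must lie between {MIN_DATE} and {max_date}")
    if (last - first).days + 1 > MAX_RANGE_DAYS:
        raise ValueError("date range must not exceed 90 days")
    return first, last


first, last = validate_active(
    {"gte": "2019-11-16", "lte": "2019-11-30"}, today=date(2020, 6, 1)
)
print(first, last)
```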
Location Filter Fields
A location filter is required and this is done by specifying one of the following:
Place filter defined by a list of Place IDs, or
Geo-Distance filter
Filter required
Either a place_id or a geo filter criterion must be defined.
Place filter validations:
Supports up to a maximum of three Place IDs
Mutually exclusive with Geo-Distance filter field
Field | Description |
---|---|
place_id list optional* Mutually exclusive with geo | A list of Place IDs (see Places). E.g. all events that apply to the State of New York: "place_id": [5128638] |
geo object optional* Mutually exclusive with place_id | Filter for geo-location and radius. It can be difficult to work out a suitable radius around your location, so to make it easier please use our Suggested Radius API. E.g. "geo": {"lon": -71.49978, "lat": 41.75038, "radius": "30mi"} |
GEO-DISTANCE FILTER FIELDS
A filter based on a geo-point and the distance from it, defined by Latitude: {lat}, Longitude: {lon} and Radius: {radius}.
Validations:
-90 <= Latitude <= 90
-180 <= Longitude <= 180
Radius cannot exceed 100 miles or equivalent
Allowed units for radius: km, mi, m, ft
Field | Description |
---|---|
lat number required | Latitude value. E.g. "lat": 41.75038 |
lon number required | Longitude value. E.g. "lon": -71.49978 |
radius string required | Radius measure and unit value. E.g. "radius": "50km" or "radius": "10mi" or "radius": "200m" or "radius": "500ft" |
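The geo validations lend themselves to a small client-side check. The following Python sketch converts the radius to miles before comparing it with the 100-mile limit; validate_geo is a hypothetical helper, and whether the API accepts decimal radius values is an assumption here.

```python
import re

# Conversion factors from each allowed unit to miles.
MILES_PER_UNIT = {"mi": 1.0, "km": 1 / 1.609344, "m": 1 / 1609.344, "ft": 1 / 5280}
RADIUS_PATTERN = re.compile(r"^(\d+(?:\.\d+)?)(km|mi|m|ft)$")
MAX_RADIUS_MILES = 100


def validate_geo(lat, lon, radius):
    """Validate a geo filter and return it as a dict for the request body."""
    if not -90 <= lat <= 90:
        raise ValueError("latitude must be between -90 and 90")
    if not -180 <= lon <= 180:
        raise ValueError("longitude must be between -180 and 180")
    match = RADIUS_PATTERN.match(radius)
    if match is None:
        raise ValueError("radius must be a number followed by km, mi, m or ft")
    value, unit = float(match.group(1)), match.group(2)
    if value * MILES_PER_UNIT[unit] > MAX_RADIUS_MILES:
        raise ValueError("radius cannot exceed 100 miles or equivalent")
    return {"lat": lat, "lon": lon, "radius": radius}


geo = validate_geo(41.75038, -71.49978, "30mi")
print({"location": {"geo": geo}})
```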
PHQ Attendance Criteria Fields
Defines the desired statistical measures to be processed and returned, and provides the means to define a rank filter.
The PHQ Attendance criteria are based on the PHQ Statistical Criteria; they may be extended specifically for attendance in the future.
PHQ Viewership Criteria Fields
Defines the desired statistical measures to be processed and returned, and provides the means to define a rank filter.
The PHQ Viewership criteria are based on the PHQ Statistical Criteria; they may be extended specifically for viewership in the future.
PHQ Base Statistical Criteria Fields
Defines the base fields for processing of statistical features:
stats: an optional list of aggregation functions applied to the phq_attendance/phq_viewership value. Validations:
count and sum are the default stats included in the response.
Other available functions include: min, max, avg, median and std_dev.
phq_rank: an optional phq_rank based filter - see PHQ Rank Filter
Field | Description |
---|---|
stats list of string optional* (* count and sum are processed by default) | List of desired stats to be processed. E.g. "stats": ["count", "min", "max"] |
phq_rank object optional | Filter for PHQ Rank. E.g. "phq_rank": {"gt": 50, "lt": 80} |
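A criteria object can be built and checked against the list of available functions before it is sent. The Python sketch below is a minimal illustration; build_criteria is a hypothetical helper, and returning true when nothing is specified mirrors the "true means default stats" convention used in the request examples.

```python
ALLOWED_STATS = ("count", "sum", "min", "max", "avg", "median", "std_dev")


def build_criteria(stats=None, phq_rank=None):
    """Return a criteria dict, or True when only the defaults are wanted."""
    criteria = {}
    if stats is not None:
        unknown = [name for name in stats if name not in ALLOWED_STATS]
        if unknown:
            raise ValueError(f"unsupported stats: {unknown}")
        criteria["stats"] = list(stats)
    if phq_rank is not None:
        criteria["phq_rank"] = dict(phq_rank)
    # An empty criteria object carries no information, so fall back to True.
    return criteria or True


payload_fragment = {
    "phq_attendance_sports": build_criteria(
        ["count", "std_dev", "median"], {"gt": 50}
    ),
    "phq_attendance_concerts": build_criteria(),
}
print(payload_fragment)
```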
PHQ Rank Filter
PHQ rank filter. Supports gt, gte, lt, and lte constraints.
Validations:
{gt|gte} constraint value must be less than {lt|lte} constraint value.
Minimum rank is 0
Maximum rank is 100
Field | Description |
---|---|
gt integer optional | Filter for phq_rank >. E.g. "gt": 65 |
gte integer optional | Filter for phq_rank >=. E.g. "gte": 50 |
lt integer optional | Filter for phq_rank <. E.g. "lt": 90 |
lte integer optional | Filter for phq_rank <=. E.g. "lte": 88 |
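The rank filter rules can be sketched as a short validation function. validate_rank_filter below is a hypothetical helper; it assumes that at most one lower bound (gt or gte) and one upper bound (lt or lte) is supplied.

```python
RANK_FILTER_KEYS = {"gt", "gte", "lt", "lte"}


def validate_rank_filter(rank_filter):
    """Validate a phq_rank filter dict and return a copy of it."""
    unknown = set(rank_filter) - RANK_FILTER_KEYS
    if unknown:
        raise ValueError(f"unsupported constraints: {sorted(unknown)}")
    for value in rank_filter.values():
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError("rank constraints must be integers")
        if not 0 <= value <= 100:
            raise ValueError("rank constraints must be between 0 and 100")
    lower = rank_filter.get("gt", rank_filter.get("gte"))
    upper = rank_filter.get("lt", rank_filter.get("lte"))
    if lower is not None and upper is not None and lower >= upper:
        raise ValueError("lower constraint must be less than upper constraint")
    return dict(rank_filter)


rank_filter = validate_rank_filter({"gt": 50, "lt": 80})
print({"phq_rank": rank_filter})
```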
Filtering on start and end times
This functionality is currently only available for viewership features. Additional features will be supported in the future. Time of day filters can be used to narrow down viewership aggregations to those relevant to your hours of operation and other day-parted analyses. To use time of day filters, the hour_of_day_start section must be populated with a time range between 0 - 23, corresponding to hours in a 24h time format. Only whole numbers are currently supported.
Example of hour of the day filtering.
POST /v1/features/ HTTP/1.1
Accept: application/json
Content-Type: application/json
Authorization: Bearer $ACCESS_TOKEN
{
"active": {
"gte": "2019-12-16",
"lte": "2020-01-30"
},
"hour_of_day_start": {
"gte": 13,
"lte": 15
},
"location": {
"place_id": [
"6252001"
]
},
"phq_viewership_sports_american_football": {
"stats": [
"count",
"std_dev",
"median"
],
"phq_rank": {
"gt": 50
}
},
"phq_viewership_sports_american_football_ncaa_men": true,
"phq_viewership_sports_american_football_nfl": true
}
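The hour range can be validated before it is added to a request. The Python sketch below builds the hour_of_day_start section with the gte and lte bounds used in the example above; hour_of_day_filter is a hypothetical helper name.

```python
def hour_of_day_filter(start_hour, end_hour):
    """Build an hour_of_day_start section from two whole hours (0-23)."""
    for hour in (start_hour, end_hour):
        # Only whole numbers are supported; bool is excluded explicitly.
        if isinstance(hour, bool) or not isinstance(hour, int):
            raise ValueError("hours must be whole numbers")
        if not 0 <= hour <= 23:
            raise ValueError("hours must be between 0 and 23")
    return {"hour_of_day_start": {"gte": start_hour, "lte": end_hour}}


payload = {
    "active": {"gte": "2019-12-16", "lte": "2020-01-30"},
    "location": {"place_id": ["6252001"]},
    "phq_viewership_sports_american_football_nfl": True,
}
# Restrict viewership aggregation to events starting between 13:00 and 15:00.
payload.update(hour_of_day_filter(13, 15))
print(payload["hour_of_day_start"])
```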