Snowflake Data Science Guide

Transforming Event Data into ML-Ready Features in Snowflake


Last updated 12 months ago


This guide and the Features API

This guide shows how to generate machine-learning-ready features from PredictHQ's intelligent event data in Snowflake, similar to the output of the Features API. It is not intended to reach parity with the Features API, which provides more comprehensive results. Where possible, our primary recommendation is to use the Features API. For more information, see the Features API documentation; for a more detailed guide on using the Features API for machine learning, see the Get ML Features guide.

If you don't know which option is best for you, please reach out!

Overview

This guide assumes you have access to PredictHQ's events data in Snowflake via a Snowflake data share. It shows how to generate aggregations similar to those provided by the Features API, which are documented in the Features API list of available features. These features are daily aggregates that sum a value across all events happening around a location, for example the number of people attending events near that location. The goal is to generate aggregated features that can be used in demand forecasting.

Snowflake's ease of use and integration options have made it a popular cloud data warehouse, and it is one of the ways to access PredictHQ's intelligent event data. In this guide, machine-learning-ready, daily aggregated event data is generated per location. The output is intended to be similar to the data provided by the PredictHQ Features API and ready to add to a training set as quickly as possible.
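To make the idea of a daily aggregated feature concrete, here is a minimal Python sketch that sums attendance across events for each day. The event records and field names are hypothetical, and counting a multi-day event's full attendance on every day it runs is a simplifying assumption, not necessarily how the Features API treats multi-day events.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical events near one location; field names are illustrative only.
events = [
    {"title": "Concert", "start": date(2024, 1, 5), "end": date(2024, 1, 5), "attendance": 12000},
    {"title": "Expo", "start": date(2024, 1, 5), "end": date(2024, 1, 7), "attendance": 3000},
]


def daily_attendance(event_rows):
    """Sum attendance over all events active on each day."""
    totals = defaultdict(int)
    for event in event_rows:
        day = event["start"]
        while day <= event["end"]:
            # Assumption: full attendance is counted on every active day.
            totals[day] += event["attendance"]
            day += timedelta(days=1)
    return dict(sorted(totals.items()))


features = daily_attendance(events)
for day, total in features.items():
    print(day.isoformat(), total)
```

Each output row corresponds to one day for one location, which is the shape the rest of this guide builds towards.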

Requirements:

1. A list of locations with their respective latitude & longitude.

2. Access to PredictHQ's intelligent event data via Snowflake.

3. Understanding of SQL & Snowflake's SQL Interface.

The process to follow:

  1. Get the Suggested Radius for Each Location

  2. Create the input table filled with locations

  3. Set the date range that you want to retrieve data for

  4. Choose which method to follow

  5. Use the output in machine learning demand forecasting models or for other applications

Get the Suggested Radius for each Location

When querying events at the location level, a common approach is to use a latitude, longitude, and radius to retrieve the events within a given area. A common difficulty is knowing which radius to use when searching for events. This is where the Suggested Radius API comes in. The example below uses the PredictHQ Python SDK to request a suggested radius for each location in a list.

# The code below uses the PredictHQ Python SDK
# https://docs.predicthq.com/integrations/sdks/python-sdk
from predicthq import Client 

# Please copy paste your access token here
# or read our Quickstart documentation if you don't have a token yet
# https://docs.predicthq.com/guides/quickstart/
ACCESS_TOKEN = 'ABC123'

phq_client = Client(access_token=ACCESS_TOKEN)

# Specify a list of locations to retrieve the suggested radius for
list_of_locations = [
    {'name': 'store1-chicago', 'latitude': '41.81310', 'longitude': '-87.65860', 'industry': 'retail'},
    {'name': 'Hyde Park', 'latitude': '51.50736', 'longitude': '-0.16411', 'industry': 'accommodation'},
    {'name': 'store10-new-york', 'latitude': '40.730610', 'longitude': '-73.935242', 'industry': 'retail'},
]

for location in list_of_locations:
    # Get suggested radius for a given location and the industry of interest
    # to be used when retrieving events. supported industries: 'parking', 'restaurants', 'retail', 'accommodation'
    suggested_radius = phq_client.radius.search(
        location__origin=f"{location['latitude']},{location['longitude']}",
        radius_unit="mi",
        industry=location["industry"]
    )
    print(f"Suggested Radius for {location['name']} with the industry {location['industry']}: {suggested_radius.radius} {suggested_radius.radius_unit}")
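To illustrate what a latitude, longitude, and radius search means geometrically, the sketch below checks whether a point falls within a radius of a location using the haversine great-circle distance. This is only an illustration: the helper names and the two test points are hypothetical, and the way PredictHQ matches events to a radius may differ.

```python
import math

# Mean Earth radius in each supported unit.
EARTH_RADIUS = {"km": 6371.0088, "mi": 3958.7613}


def haversine(lat1, lon1, lat2, lon2, unit="mi"):
    """Great-circle distance between two points, in the given unit."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS[unit] * math.asin(math.sqrt(a))


def within_radius(origin, point, radius, unit="mi"):
    """True if `point` (lat, lon) lies within `radius` of `origin` (lat, lon)."""
    return haversine(origin[0], origin[1], point[0], point[1], unit) <= radius


hyde_park = (51.50736, -0.16411)       # location from the example list
nearby_point = (51.5009, -0.1774)      # hypothetical point under a mile away
distant_point = (51.5560, -0.2796)     # hypothetical point several miles away

print(within_radius(hyde_park, nearby_point, 2.06, "mi"))
print(within_radius(hyde_park, distant_point, 2.06, "mi"))
```

The radius value 2.06 mi is the one shown for Hyde Park in the input table later in this guide.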

Create Input table used in both methods

To reference the locations later in the guide, a table of locations (called SAVED_LOCATIONS) is required as input. The SAVED_LOCATIONS input table requires this format:

| location | latitude | longitude | radius | radius_unit | date_start | date_end |
| --- | --- | --- | --- | --- | --- | --- |
| store1-chicago | 41.81310 | -87.65860 | 4.11 | mi | 2023-07-01 | 2023-12-31 |
| Hyde Park | 51.50736 | -0.16411 | 2.06 | mi | 2024-01-01 | 2024-03-31 |
| store10-new-york | 40.73061 | -73.93524 | null | ... | ... | ... |

  • location: a unique identifier for the location.

  • latitude/longitude: it is recommended to include 5 decimal places.

  • radius: the value returned by the Suggested Radius API.

  • radius_unit: either “km” (kilometers) or “mi” (miles).

  • date_start/date_end: the date range for the data to be returned. Can be changed.
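Before loading rows into the input table, it can help to check them against the column rules above. The following is a minimal sketch of such a check in Python; the function name, the row layout, and the sample rows are assumptions made for illustration and are not part of the guide's SQL.

```python
from datetime import date

VALID_UNITS = {"km", "mi"}


def validate_location(row):
    """Return a list of problems found in one input row (empty if it looks fine)."""
    problems = []
    if not row.get("location"):
        problems.append("location is missing")
    try:
        lat = float(row["latitude"])
        lon = float(row["longitude"])
    except (KeyError, TypeError, ValueError):
        problems.append("latitude/longitude must be numeric")
    else:
        if not -90 <= lat <= 90:
            problems.append("latitude out of range")
        if not -180 <= lon <= 180:
            problems.append("longitude out of range")
    radius = row.get("radius")
    if not isinstance(radius, (int, float)) or radius <= 0:
        problems.append("radius must be a positive number")
    if str(row.get("radius_unit", "")).lower() not in VALID_UNITS:
        problems.append("radius_unit must be 'km' or 'mi'")
    start, end = row.get("date_start"), row.get("date_end")
    if not (isinstance(start, date) and isinstance(end, date)) or start > end:
        problems.append("date_start must be on or before date_end")
    return problems


# Hypothetical rows: the first is complete, the second has no radius yet
# and an unsupported unit string.
good_row = {"location": "Hyde Park", "latitude": "51.50736", "longitude": "-0.16411",
            "radius": 2.06, "radius_unit": "mi",
            "date_start": date(2024, 1, 1), "date_end": date(2024, 3, 31)}
bad_row = {"location": "store10-new-york", "latitude": "40.73061", "longitude": "-73.93524",
           "radius": None, "radius_unit": "miles",
           "date_start": date(2024, 1, 1), "date_end": date(2024, 3, 31)}

for row in (good_row, bad_row):
    print(row["location"], validate_location(row))
```

A row with a null radius, like the third row in the table above, would be flagged so that the Suggested Radius API can be called for it first.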

Here is the input table used when running this code. Note the data type of each column, and that the latitude and longitude columns are named lat and lon in the SQL.

CREATE OR REPLACE TEMP TABLE saved_locations (
    location STRING
    , lat STRING
    , lon STRING
    , radius FLOAT
    , radius_unit STRING
    , date_start DATE
    , date_end DATE);

INSERT INTO saved_locations (location, lat, lon, radius, radius_unit, date_start, date_end)
VALUES ('Hyde Park', '51.5073638', '-0.1641135', 2.06, 'mi'   
        , DATE_TRUNC('MONTH', DATEADD(MONTH, -3, CURRENT_DATE()))       --Start of 3 months ago
        , DATEADD(DAY, -1, DATE_TRUNC('MONTH', CURRENT_DATE()))         --End of last month
        )
;

By default, the example above returns 3 months of historical data. When training a model, we recommend at least two years of historical data, though this can be changed as needed. If you are forecasting for a future period, the date range should cover that period, for example the next 2 weeks.
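The date ranges described above can be sketched in Python as follows. The first function mirrors the DATE_TRUNC/DATEADD expressions in the INSERT statement; the other two are assumptions about how one might express "two years of history" (approximated as 730 days ending yesterday) and "the next 2 weeks" (14 days starting today). All function names are hypothetical.

```python
from datetime import date, timedelta


def default_range(today):
    """Start of the month three months ago through the end of last month."""
    first_of_this_month = today.replace(day=1)
    # Count months since year 0, then step back three months.
    month_index = first_of_this_month.year * 12 + (first_of_this_month.month - 1) - 3
    start = date(month_index // 12, month_index % 12 + 1, 1)
    end = first_of_this_month - timedelta(days=1)
    return start, end


def training_range(today, days=730):
    """Roughly two years of history ending yesterday (assumption: 730 days)."""
    end = today - timedelta(days=1)
    return end - timedelta(days=days - 1), end


def forecast_range(today, days=14):
    """The next `days` days, starting today (assumption: today is included)."""
    return today, today + timedelta(days=days - 1)


today = date.today()
for name, fn in (("default", default_range), ("training", training_range), ("forecast", forecast_range)):
    start, end = fn(today)
    print(name, start.isoformat(), end.isoformat())
```

The resulting start and end dates would go into the date_start and date_end columns of the input table.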

Once the input table is in the format above, the code below expands it into a day-by-day table called SAVED_LOCATIONS_DAILY, with one row per location per day:

-- Expand each location into one row per day across its date range.
CREATE OR REPLACE TEMP TABLE saved_locations_daily AS
SELECT
    DATE(s.date_start) + t.value::INT AS date,
    s.location,
    s.lat,
    s.lon,
    s.radius,
    LOWER(s.radius_unit) AS radius_unit        -- lowercase in case of data entry mistakes
FROM saved_locations s,
    TABLE(FLATTEN(ARRAY_GENERATE_RANGE(0, DATEDIFF('day', s.date_start, s.date_end) + 1))) t
;
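For readers less familiar with FLATTEN and ARRAY_GENERATE_RANGE, the following Python sketch performs the same expansion on an in-memory list, assuming rows shaped like the SAVED_LOCATIONS table. It is only meant to show the expected output shape (one row per location per day, inclusive of both end dates); the sample row and function name are hypothetical.

```python
from datetime import date, timedelta

# Hypothetical input row mirroring the saved_locations columns.
saved_locations = [
    {"location": "Hyde Park", "lat": "51.5073638", "lon": "-0.1641135",
     "radius": 2.06, "radius_unit": "MI",
     "date_start": date(2024, 1, 1), "date_end": date(2024, 1, 3)},
]


def expand_daily(rows):
    """Return one row per location per day between date_start and date_end."""
    daily = []
    for row in rows:
        n_days = (row["date_end"] - row["date_start"]).days + 1  # inclusive range
        for offset in range(n_days):
            daily.append({
                "date": row["date_start"] + timedelta(days=offset),
                "location": row["location"],
                "lat": row["lat"],
                "lon": row["lon"],
                "radius": row["radius"],
                "radius_unit": row["radius_unit"].lower(),  # guard against entry mistakes
            })
    return daily


saved_locations_daily = expand_daily(saved_locations)
for row in saved_locations_daily:
    print(row["date"].isoformat(), row["location"], row["radius"], row["radius_unit"])
```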

Choose which Method to use

1. Snowpark Method Guide

Call the Features API with Python and save the output ML Features into Snowflake.

2. SQL Method Guide

Use SQL in Snowflake to run over the events table and create the ML Features.

Integrating PredictHQ Features into Your Demand Forecasting Model

The ML_FEATURES_FOR_LOCATIONS table offers ready-to-use features for forecasting models. Merge these features with current demand data using the location and date as keys. Train models using historical data from this table and use future data for forecasting.
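The merge on location and date can be sketched in plain Python as a left join from demand rows to feature rows. The column names phq_attendance_sports and units_sold, and all values, are illustrative assumptions; the actual columns in ML_FEATURES_FOR_LOCATIONS depend on the method you followed.

```python
from datetime import date

# Hypothetical rows exported from ML_FEATURES_FOR_LOCATIONS.
feature_rows = [
    {"location": "Hyde Park", "date": date(2024, 1, 1), "phq_attendance_sports": 0},
    {"location": "Hyde Park", "date": date(2024, 1, 2), "phq_attendance_sports": 45000},
]

# Hypothetical demand data for the same location.
demand_rows = [
    {"location": "Hyde Park", "date": date(2024, 1, 1), "units_sold": 120},
    {"location": "Hyde Park", "date": date(2024, 1, 2), "units_sold": 310},
    {"location": "Hyde Park", "date": date(2024, 1, 3), "units_sold": 140},
]


def join_features(demand, features):
    """Left-join feature columns onto demand rows using (location, date) as the key."""
    by_key = {(f["location"], f["date"]): f for f in features}
    joined = []
    for row in demand:
        match = by_key.get((row["location"], row["date"]), {})
        extra = {k: v for k, v in match.items() if k not in ("location", "date")}
        joined.append({**row, **extra})
    return joined


training_set = join_features(demand_rows, feature_rows)
for row in training_set:
    print(row)
```

Demand rows without a matching feature row keep only their own columns, so gaps in the feature date range are easy to spot before training.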

The Suggested Radius API returns a radius that can be used to find attended events around a given location. It takes into account factors such as population density, the surrounding street network, and the industry vertical of the location.

As a first step, it is recommended to get the suggested radius for each location before moving forward with the guide. The example in the "Get the Suggested Radius for each Location" section shows how to query the Suggested Radius API for a list of locations using the PredictHQ Python SDK. Note: this code needs to be run in an environment separate from Snowflake.

For more information on the Suggested Radius API, visit our documentation.


For more information on creating machine-learning-ready features from PredictHQ's intelligent event data, check out the Get ML Features guide or the blog post Enhancing Demand Forecasting with PredictHQ and PowerBI: A Technical Exploration. While these resources discuss the Features API, remember to use the ML_FEATURES_FOR_LOCATIONS table instead.
