2 years ago by IPinfo Team 7 min read

What makes IPinfo’s Snowflake integration ideal for data engineers?

Since September 2020, IPinfo and Snowflake have partnered to make IP data available in the Snowflake Marketplace. This IPinfo-supported integration has enabled a variety of use cases for Snowflake users, ranging from threat intelligence to data security and compliance.

Why use IPinfo in Snowflake?

A major benefit of Snowflake is that users can load data at the same time it's being analyzed. Snowflake is available on Google Cloud, AWS, and Azure, enabling organizations to eliminate the data silos that prevent scalable data management and sharing.

Many IPinfo users have found that merging our accurate data fields with Snowflake’s elastic performance engine, intelligent infrastructure, and secure data sharing systems helps their teams scale everything from interactive workloads to batch.

IPinfo’s data is used in many workloads, including the following:

  • Data engineering
  • Data lakes
  • Data warehouses
  • Data science
  • Data applications
  • Data sharing

In this article, we’re going to highlight some technical resources, effective use cases, industries that use our data in Snowflake, and how we make our data more performant on this platform.

Technical resources

Because users don’t need to move any data out of Snowflake, they can build efficient pipelines for training ML models and running other analytics. Snowflake supports a variety of programming languages, including Python, as alternatives to developing applications in Java or C/C++.

Snowflake also helps run operations that are computationally expensive and take significant amounts of time. Its elastic performance engine efficiently distributes the computation of expensive functions, so that they can run model inference for one million records in under 10 seconds.

Because Python functions can be called from SQL queries, Python support extends to every part of Snowflake that works with SQL. Teams also benefit from using familiar tools and programming languages, which improves efficiency for data scientists, engineers, and others. Analytics teams can write ML fraud prediction models in Python, an open-source language, and deploy them with the scale and performance that Snowflake offers.

Python also provides a rich ecosystem of open-source libraries. With the Anaconda integration, teams won’t need to install additional packages or run into dependency management issues.

IPinfo’s technical team

IPinfo is built with developers, engineers, and IT specialists in mind. We deliberately bring on developers and engineers as data experts. So not only do they know IP address data, including its types and implications, they also know how to help troubleshoot and innovate on use cases.

Our team also stays up to date with trends within Snowflake Marketplace. A good example of this is how we recently made IP data more performant in Snowflake using UDTFs.

UDTFs: Simplifying IP address queries in Snowflake

Because our data maps to IP ranges, rather than specific IPs, range joins can be slow. That's why we created an optimized version. By splitting the data and creating a join key, you can now do an exact join.
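To illustrate why a join key turns a range join into an exact join, here is a minimal Python sketch. It assumes the key is the top 16 bits of an IPv4 address and that every range is split wherever it crosses one of those boundaries. The actual key layout in IPinfo's Snowflake data is not described in this article, and all function names below are hypothetical.

```python
import ipaddress
from collections import defaultdict

PREFIX_BITS = 16  # assumed key width for IPv4; the real join_key may differ


def join_key(ip_int: int) -> int:
    """Return the top PREFIX_BITS bits of a 32-bit IPv4 integer."""
    return ip_int >> (32 - PREFIX_BITS)


def split_range(start: int, end: int):
    """Yield pieces of [start, end] that each stay inside one key block."""
    block = 1 << (32 - PREFIX_BITS)
    while start <= end:
        block_end = start | (block - 1)  # last address sharing start's key
        yield start, min(end, block_end)
        start = block_end + 1


def build_index(ranges):
    """Map join key -> list of (start, end, payload) pieces."""
    index = defaultdict(list)
    for first, last, payload in ranges:
        lo = int(ipaddress.IPv4Address(first))
        hi = int(ipaddress.IPv4Address(last))
        for s, e in split_range(lo, hi):
            index[join_key(s)].append((s, e, payload))
    return index


def lookup(index, ip: str):
    """Exact match on the key, then a range check within the small bucket."""
    n = int(ipaddress.IPv4Address(ip))
    for s, e, payload in index.get(join_key(n), []):
        if s <= n <= e:
            return payload
    return None


ranges = [
    ("10.0.255.0", "10.2.0.255", "net-a"),  # crosses two key boundaries
    ("192.0.2.0", "192.0.2.255", "net-b"),
]
index = build_index(ranges)
print(lookup(index, "10.1.42.7"))
```

The equality match on the key narrows each lookup to a handful of candidate rows, so the range comparison only runs inside that small bucket instead of across the whole dataset.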

In the past, BETWEEN was the intuitive way to conduct raw joins. The problem is that this scales badly. As a result, we get a lot of support requests from users asking us to help them figure out why their queries are slow.

Using the join_key join form will make queries incredibly fast. But all of this would be much better if users didn’t need to worry about the details of how our data is stored or about making the join performant at all in the first place.

With these new UDTFs, users are able to enrich any IP address within Snowflake with our IP geolocation, company, carrier & VPN detection data sets with a simple call to our function - no need to think about using or understanding how to use join_key directly.

It works with IPv4 and IPv6 addresses and scales to massive data sets as represented by the graph below.

How to use UDTF functions for data enrichment

Assume you have a table called logs with a column ip that stores the string form of an IPv4 or IPv6 address. A short query using the UDTFs would perform a join between your log data and three different IP datasets: geolocation, ASN, and privacy.
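A query of this shape could be assembled as in the Python sketch below. The UDTF names passed in are placeholders, not IPinfo's actual function names, which depend on the Marketplace listing you install. The comma-separated TABLE(...) form is Snowflake's syntax for applying a table function to each input row; treat the generated SQL as an untested assumption rather than a verified query.

```python
def build_enrichment_query(table: str, ip_column: str, udtfs: list[str]) -> str:
    """Return SQL that joins `table` to one table function per dataset."""
    lines = ["SELECT *", f"FROM {table} AS l"]
    # Each UDTF is called with the current row's IP address.
    for fn in udtfs:
        lines[-1] += ","
        lines.append(f"  TABLE({fn}(l.{ip_column}))")
    return "\n".join(lines)


# Placeholder UDTF names (hypothetical); substitute the ones from your listing.
query = build_enrichment_query(
    "logs", "ip", ["geolocation_udtf", "asn_udtf", "privacy_udtf"]
)
print(query)
```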

This would generate a table containing first the columns of your log data, including the IP, and then all the columns available from the geolocation, ASN & privacy datasets. To read more about how we created more performant joins, check out our recent article.

All this to say, IPinfo is built by developers for developers.

Not only do we help small organizations use IP data, we regularly accommodate enterprise requests from Fortune 10s to Fortune 500s, including global brand names, popular Silicon Valley tech companies, government organizations and even multi-billion dollar enterprises.

Our Enterprise-Grade Support Packages allow users to choose a dedicated help channel on any one of these platforms:

  • Microsoft Teams
  • Slack
  • Signal
  • Telegram

These support channels give direct access to our engineers who can help troubleshoot use cases involving IPinfo’s data.

Snowflake learning resources

Additionally, Snowflake offers many other learning resources such as these:

In short, using IPinfo’s data in Snowflake’s platform is a fully supported experience. IPinfo can help answer your IP address data concerns and Snowflake’s robust learning resources can keep you going.

Notable industries that use IPinfo’s data in Snowflake

A variety of industries use IP address data from IPinfo within Snowflake’s infrastructure. While there are more than we could count, here are some examples of industry-leading organizations that merge Snowflake’s computational power with IPinfo’s accurate data.

MedTech: Removing data silos for business intelligence and value-based care

MedTech organizations use IPinfo in Snowflake for a variety of reasons. For starters, Snowflake is HITRUST, HIPAA, and GxP compliant (among many other security and compliance certifications).

One organization in MedTech requested customized IPinfo datasets through Snowflake’s Marketplace. They use IP address data to analyze logins based on geolocation and detect anonymous IPs for login management.

Snowflake’s platform allows MedTech companies to break down data silos and merge existing healthcare data with other reliable data sources, such as IP address data.

Other healthcare technology solutions merge IP to company data with existing firmographics to improve Artificial Intelligence (AI) and Machine Learning (ML).

Additionally, IP geolocation data helps target the right content to the right patient or consumer to enable better patient education, patient relationship management, and value-based care. This is how Bupa uses IPinfo’s data within remote healthcare. Several healthcare organizations use IPinfo’s data to restrict access to

IPinfo's API is a core part of our website. As a result, we have created a bespoke solution that integrates with our CMS so we can manage the geolocation capability to suit our needs. - Anthony Jaggs, Senior Digital Communications Manager, Bupa

Still, other MedTech organizations use IPinfo’s data to connect the right medical professional with patients based on geolocation. Curam and Inner Hour, two other health technology organizations, use geolocation data for this purpose.

Healthcare is one industry that’s often slow to innovate. But Snowflake empowers HIPAA- and HITECH-compliant business intelligence and analytics for MedTech organizations striving to remove data silos and improve patient care.

Fintech: AI-powered fraud prevention models

Fintech organizations use IPinfo’s data within Snowflake to build AI fraud prediction models.

For instance, Dupaco - a full-service banking and online banking credit union - uses IPinfo’s geolocation data to pinpoint locations of withdrawals or account logins. By sharing this information with customers, they’ve been able to improve alerts and cut down on fraud.

Here’s another example of how customers can create Machine Learning fraud prediction models in Snowflake.

In this fraud prediction model, there are three primary predictors used to establish fraud scores.

  1. Whether the IP address is masked
  2. The distance between order placement and shipping location
  3. The average price per item in the transaction

These three points are an important part of the model since they can pinpoint users who may be masking their location and purchases.
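As a rough illustration, the three inputs listed above could be derived per order as in the Python sketch below. The field names (ip_is_masked, ip_lat, ship_lat, and so on) and the fraud_features helper are hypothetical and not part of IPinfo's or Snowflake's APIs. The sketch only computes the features; how a model weighs them is not covered here.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def fraud_features(order: dict) -> dict:
    """Derive the three model inputs from one order record (assumed schema)."""
    count = order["item_count"]
    return {
        # 1. Whether the IP address is masked (VPN, proxy, and similar)
        "ip_is_masked": int(order["ip_is_masked"]),
        # 2. Distance between where the order was placed and where it ships
        "distance_km": haversine_km(
            order["ip_lat"], order["ip_lon"], order["ship_lat"], order["ship_lon"]
        ),
        # 3. Average price per item in the transaction
        "avg_item_price": order["total_price"] / count if count else 0.0,
    }


example = {
    "ip_is_masked": True,
    "ip_lat": 0.0, "ip_lon": 0.0,     # location derived from the order's IP
    "ship_lat": 0.0, "ship_lon": 90.0,  # shipping address coordinates
    "total_price": 300.0,
    "item_count": 3,
}
features = fraud_features(example)
```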

Another important point to mention from the above example is that Snowflake users benefit from Snowflake’s DataFrame API, which allows concise Python code. But beyond this, fintech users can make predictions at scale by running a model like this on Snowflake. Snowpark - a newer feature that allows developers to bring their favorite tools and deploy them in a serverless manner to Snowflake’s virtual warehouse compute engine - makes running functions (like detect_fraud) very simple.

Using Snowpark DataFrame queries, users can generate predictions for millions of records in a matter of seconds.

This is just one example of the many ways that IPinfo’s accurate data empowers scalable use cases on Snowflake’s platform.

For more information about IPinfo for fraud prevention, download the guide.

Cybersecurity: Botnet detection, risk scores, SIEM and regulatory compliance

We’ve also seen Snowflake customers use IPinfo’s data for a variety of cybersecurity purposes. Here’s one video showing how Snowflake customers merge a variety of data sources (such as IPinfo’s geolocation insights) to improve scalable security analytics (or SIEM) and regulatory compliance in the cloud.

We’ve seen that government organizations benefit from using geolocation, privacy detection, and other datasets for cyberwarfare, monitoring government employee access, managing online assets (such as SOCs), and much more.

For more information, download the full ebook about IPinfo for government organizations. Here’s what one of our cybersecurity customers had to say about IPinfo’s data:

Identifying the true owners and operators of compromised infrastructure has always been a challenge, but IPinfo simplifies the process. Their normalization of the data allows Spur to automate many aspects of an investigation that used to require an analyst. IPinfo provides the foundation for a successful investigation and is a must-have in any security organization. - Thomas Kilmer, Co-founder of Spur Intelligence Corp.

Read more about cybersecurity organizations who’ve benefited from IPinfo’s data by visiting our customer stories page.

Discover more ways to merge IPinfo’s data with Snowflake’s powerful cloud data platform. Connect with a data expert today!

About the author

IPinfo Team

Internet Data Expert