Have you ever wanted more granular control over how you query data from CFBD? By more granular control, I mean dynamic filtering and sorting, querying related pieces of data in a single query, and even the ability to specify exactly which fields you want returned.
What about better real-time data support in the form of subscriptions? The REST API offers a few live endpoints that require constant polling, but I'm talking about being able to create a specific data query, subscribe to that query, and have your own code notified in real time when the data in that query changes. And this goes far beyond the few live REST endpoints offered today. Imagine being able to subscribe to betting line updates, for example.
The experimental CFBD GraphQL API lets you do all of this, and it's available to Patreon Tier 3 subscribers starting today. I put emphasis on the word experimental. It doesn't yet have access to the entire CFBD data catalog, but it does incorporate a fair amount as of right now:
Team information
Conference information
Historical team/conference associations
Historical and live game data (scores, Elo ratings, excitement index, weather, media information)
Historical and live betting data
Recruiting data
Transfer data
NFL Draft history
Things that aren't currently included but will be added over time:
Drive and play data
Basic game, player, and season stats
Advanced game, player, and season stats
Neither of these lists is exhaustive.
If you'd like to learn more and see some examples, then read on.
What is GraphQL?
GraphQL is a query language for APIs. Its central premise is that it defines a data model as a "graph" of attributes and relationships. When interfacing with such an API, you specify exactly which data you want, how it should be filtered, and how it should be sorted, and it has paging abilities to grab data in batches. This is quite different from a traditional REST API, where you are given a concrete set of REST endpoints with discrete query parameters and a rigid response data model.
So how does it work differently from working with REST endpoints? The funny thing is, it basically is a REST endpoint. Unlike traditional REST APIs, where you will likely have many different endpoints spread across several HTTP methods (e.g. GET, POST, PUT, etc.), GraphQL exposes a single POST endpoint, usually named just graphql. You submit a POST request to that endpoint, and the request body contains all the information about what you are trying to do and what data you want to receive back, all in GraphQL syntax.
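To make the "single POST endpoint" idea concrete, here is a minimal Python sketch that builds (but does not send) such a request using only the standard library. The URL and the Bearer header are the ones given later in this post; the helper name build_graphql_request and the tiny query are illustrative assumptions, and the {"query": ...} JSON body is the common GraphQL-over-HTTP convention rather than anything specific to CFBD.

```python
import json
import urllib.request

# Endpoint URL as given later in this post; the API key below is a placeholder.
GRAPHQL_URL = "https://graphql.collegefootballdata.com/v1/graphql"


def build_graphql_request(query: str, api_key: str) -> urllib.request.Request:
    """Wrap a GraphQL document in the JSON body of a single POST request."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        GRAPHQL_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


# Hypothetical one-line query, only here to show the shape of the request.
request = build_graphql_request("query { game(limit: 1) { id } }", "YOUR_API_KEY_HERE")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To actually send it, pass the object to urllib.request.urlopen(request).
```

Whatever you want to read, the method and URL stay the same; only the query text in the body changes.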
Here is a simple GraphQL query using the new CFBD GraphQL endpoint:
query gamesQuery {
  game(where: { season: { _eq: 2024 } }, orderBy: { startDate: ASC }) {
    id
    season
    seasonType
    week
    startDate
    homeTeam
    homeClassification
    homeConferece
    homePoints
    awayTeam
    awayClassification
    awayConferece
    awayPoints
    lines {
      provider {
        name
      }
      spread
    }
  }
}
GraphQL offers three types of operations: queries for querying data, mutations for changing data, and subscriptions for subscribing to data updates. The above example is a query named gamesQuery. The query keyword is important because it tells the API that we are querying for data, but the gamesQuery name is completely arbitrary. In fact, we could have left off query gamesQuery entirely and the API would implicitly know we are trying to query data.
The interesting stuff starts on line 2. There is a game object made available in the graph, and we are telling the API that we want to query these objects. We are also including some filtering and sorting on this line: we are telling the API to return games from the 2024 season and to sort by the startDate property.
Let's look at the filter a little more closely: where: { season: { _eq: 2024 } }. We are using an equality operator (_eq) to filter on the 2024 season, but there are many more operators. For example, we could use _gt if we wanted to query seasons greater than a specific year. We can also combine filters. Say we wanted to query games from the 2024 season, but only in weeks 1, 3, and 5. We could do something like this: where: { season: { _eq: 2024 }, week: { _in: [1, 3, 5] } }. We'll look at some more complex scenarios later on.
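If it helps to think about the filter in Python terms, the sketch below mimics what a where clause means by applying it to a handful of local rows. This is only an illustration of the semantics, assuming that conditions on several fields are combined with AND as in the weeks example; the real filtering happens on the server, and the matches helper and the sample rows are made up for this example.

```python
# Local illustration of three filter operators; the API applies these server-side.
OPERATORS = {
    "_eq": lambda value, target: value == target,
    "_gt": lambda value, target: value is not None and value > target,
    "_in": lambda value, target: value in target,
}


def matches(row: dict, where: dict) -> bool:
    """Return True when the row satisfies every field condition (implicit AND)."""
    return all(
        OPERATORS[op](row.get(field), target)
        for field, conditions in where.items()
        for op, target in conditions.items()
    )


# Made-up rows, just enough to exercise the filter.
games = [
    {"season": 2024, "week": 1},
    {"season": 2024, "week": 2},
    {"season": 2024, "week": 5},
    {"season": 2023, "week": 1},
]

# Same shape as: where: { season: { _eq: 2024 }, week: { _in: [1, 3, 5] } }
where = {"season": {"_eq": 2024}, "week": {"_in": [1, 3, 5]}}
selected = [game for game in games if matches(game, where)]
print(selected)
```

Only the 2024 games from weeks 1 and 5 survive: week 2 fails the _in condition and the 2023 game fails the _eq condition.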
We also have an ordering statement: orderBy: { startDate: ASC }. This tells the API to sort the results by the startDate field in ascending order. Similar to filters, we can combine these if we want to sort by multiple fields, and we can specify ascending or descending order for each field.
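Sorting by several fields with a direction per field can be pictured with a small local sketch. The order_by helper below is hypothetical and only imitates the idea on made-up rows; it says nothing about how the API implements sorting.

```python
def order_by(rows: list[dict], ordering: list[tuple[str, str]]) -> list[dict]:
    """Sort rows by several fields; earlier entries in `ordering` take priority."""
    result = list(rows)
    # Python's sort is stable, so sorting by the lowest-priority field first
    # and the highest-priority field last produces a multi-field ordering.
    for field, direction in reversed(ordering):
        result.sort(key=lambda row: row[field], reverse=(direction == "DESC"))
    return result


# Made-up rows with placeholder team names.
games = [
    {"season": 2023, "week": 2, "homeTeam": "A"},
    {"season": 2024, "week": 1, "homeTeam": "B"},
    {"season": 2024, "week": 3, "homeTeam": "C"},
    {"season": 2023, "week": 5, "homeTeam": "D"},
]

# Newest season first, then weeks in ascending order within each season.
ordered = order_by(games, [("season", "DESC"), ("week", "ASC")])
print([game["homeTeam"] for game in ordered])
```

The 2024 games come first in week order (B, C), followed by the 2023 games in week order (A, D).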
As we continue past line 2, you can see that we are also able to specify which game object fields we want returned by the query. On line 16, we introduce another object in the graph via the lines property. We have an entire gameLines object that we could write a separate query on. However, we also have a relationship between games and game lines via the lines property. Because of this, we can tell the API to return any game lines associated with each game object. We can also specify which properties we want returned in these nested relationships. Notably, you'll see that we have another relationship nested inside a relationship, since the lines object has a relationship with the provider object. provider gives information on the sportsbook that offers the game line.
We've gotten this far, so we should probably look at the data that gets returned by this query.
…
{
  "id": 401635525,
  "season": 2024,
  "seasonType": "regular",
  "week": 1,
  "startDate": "2024-08-24T16:00:00",
  "homeTeam": "Georgia Tech",
  "homeClassification": "fbs",
  "homeConferece": "ACC",
  "homePoints": 24,
  "awayTeam": "Florida State",
  "awayClassification": "fbs",
  "awayConferece": "ACC",
  "awayPoints": 21,
  "lines": [
    {
      "provider": {
        "name": "ESPN Bet"
      },
      "spread": 10.5
    },
    {
      "provider": {
        "name": "DraftKings"
      },
      "spread": 11.5
    },
    {
      "provider": {
        "name": "Bovada"
      },
      "spread": 10.0
    }
  ]
},
…
As you can see, it matches the format and fields that we specified in the query. Let's write another query with a little more complexity. I want to query the most exciting games of the past 10 seasons as measured by the CFBD Excitement Index metric. My query would look like this:
query excitementQuery {
  game(
    where: { season: { _gte: 2014 }, excitement: { _isNull: false } }
    orderBy: { excitement: DESC }
    limit: 100
  ) {
    id
    season
    seasonType
    week
    startDate
    homeTeam
    homeClassification
    homeConferece
    homePoints
    awayTeam
    awayClassification
    awayConferece
    awayPoints
    excitement
  }
}
I'm writing this article right at the start of the 2024 season, so I've set my filter, where: { season: { _gte: 2014 }, excitement: { _isNull: false } }, to query all games starting with the 2014 season where the excitement field is not null. I also included a sort clause, orderBy: { excitement: DESC }, because I want to sort by excitement in descending order so that the most exciting games are returned at the top. Finally, I specified a limit of 100 results (limit: 100) because I only want the top 100 most exciting games.
Here are the partial results of that query:
{
  "data": {
    "game": [
      {
        "id": 401282177,
        "season": 2021,
        "seasonType": "regular",
        "week": 1,
        "startDate": "2021-09-05T00:00:00",
        "homeTeam": "South Alabama",
        "homeClassification": "fbs",
        "homeConferece": "SBC",
        "homePoints": 31,
        "awayTeam": "Southern Mississippi",
        "awayClassification": "fbs",
        "awayConferece": "CUSA",
        "awayPoints": 7,
        "excitement": 21.5355699358
      },
      {
        "id": 401418780,
        "season": 2022,
        "seasonType": "regular",
        "week": 9,
        "startDate": "2022-10-29T21:00:00",
        "homeTeam": "Central Arkansas",
        "homeClassification": "fcs",
        "homeConferece": "ASUN",
        "homePoints": 64,
        "awayTeam": "North Alabama",
        "awayClassification": "fcs",
        "awayConferece": "ASUN",
        "awayPoints": 29,
        "excitement": 16.5218277643
      },
      {
        "id": 401416599,
        "season": 2022,
        "seasonType": "regular",
        "week": 2,
        "startDate": "2022-09-10T22:00:00",
        "homeTeam": "Miami (OH)",
        "homeClassification": "fbs",
        "homeConferece": "MAC",
        "homePoints": 31,
        "awayTeam": "Robert Morris",
        "awayClassification": "fcs",
        "awayConferece": null,
        "awayPoints": 14,
        "excitement": 15.5860040950
      },
      …
    ]
  }
}
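Once a response like this is parsed into a Python dict, the rows sit under the "data" key and then under the name of the queried object. The sketch below walks a trimmed copy of the response shown above (two games, a subset of fields) and prints a one-line summary per game; the summarize helper is just an illustrative name.

```python
# Trimmed copy of the response shown above: two games, a subset of the fields.
response = {
    "data": {
        "game": [
            {
                "season": 2021,
                "homeTeam": "South Alabama",
                "homePoints": 31,
                "awayTeam": "Southern Mississippi",
                "awayPoints": 7,
                "excitement": 21.5355699358,
            },
            {
                "season": 2022,
                "homeTeam": "Central Arkansas",
                "homePoints": 64,
                "awayTeam": "North Alabama",
                "awayPoints": 29,
                "excitement": 16.5218277643,
            },
        ]
    }
}


def summarize(game: dict) -> str:
    """Format one game row as a short human-readable line."""
    return (
        f"{game['season']}: {game['awayTeam']} {game['awayPoints']} at "
        f"{game['homeTeam']} {game['homePoints']} "
        f"(excitement {game['excitement']:.2f})"
    )


# The rows live under data -> game, mirroring the name used in the query.
summaries = [summarize(game) for game in response["data"]["game"]]
for line in summaries:
    print(line)
```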
In the next few sections, we'll dive into how to query the CFBD GraphQL API using Insomnia and Python.
Using the CFBD GraphQL API with Insomnia
If you haven't seen my post on using Insomnia with the CFBD API, be sure to check it out. Insomnia is by far the best tool for experimenting with different APIs. Not only is it great for experimenting with traditional REST calls, but it also has really good GraphQL support. This section of the guide assumes you are familiar with Insomnia and have it set up.
So let's go ahead and open up Insomnia. Create a new request just like you normally would, but this time select "GraphQL Request" from the dropdown.
The new request should look very similar to a POST request and is even labeled as such. Before we fill in the URL, we'll add our Auth details. Select "Bearer Token" from the Auth dropdown.
In the Token field, fill in your API key. It is the same API key you use on the CFBD API. There is no need to add a Bearer prefix or anything else. Just paste in your key.
Now go ahead and fill in the URL: https://graphql.collegefootballdata.com/v1/graphql. After pasting that in, click on "schema" and select "Refresh Schema". Also, make sure that "Automatic Fetch" is enabled.
Clicking on "Show Documentation" from the same dropdown will open a documentation side panel on the right. From the side panel, click on query_root to see which queries are available.
These docs are interactive, so feel free to click around to learn about the different queries and types. The docs aren't strictly necessary to get going, but I did want to point them out because they're still a really nice feature.
Go ahead and click on the GraphQL tab, click in the code body, and then hit Ctrl+Space. The code editor has full autocomplete capabilities.
As you type out queries, you can use this functionality to guide you without even needing to know or reference the documentation.
Let's query some recruiting data. I want to query every #1 overall high school recruit since the 2014 cycle. Additionally, I want to order by overall composite rating, with the highest ratings at the top. My query would look like this:
query myQuery {
  recruit(
    where: {
      year: { _gte: 2014 }
      overallRank: { _eq: 1 }
      recruitType: { _eq: "HighSchool" }
    }
    orderBy: { rating: DESC }
  ) {
    rating
    name
    position {
      position
      positionGroup
    }
    college {
      school
      conference
    }
    recruitSchool {
      name
    }
  }
}
Feel free to mess around with the query. Select whatever fields you want to return and tweak the filters and the sorts if you want to. Once you're satisfied, go ahead and submit. This is what my query returned:
I'm actually curious about my hometown. I come from a really tiny town in northern Ohio called Huron. I'd like to know if there have been any legitimate recruits in the recruiting service era to hail from there. When I played (early aughts), the recruiting services were just becoming a thing and we didn't really have any FBS-level players. We had a really good TE named Jim Fisher who played at Michigan and would have fit the bill, but he was a year or two before my time and before Rivals and Scout got big.
Anyway, here's the query I drew up.
query myQuery {
  recruit(
    where: {
      recruitType: { _eq: "HighSchool" }
      hometown: { city: { _eq: "Huron" }, state: { _eq: "OH" } }
    }
    orderBy: { rating: DESC }
  ) {
    stars
    ranking
    positionRank
    rating
    name
    position {
      position
      positionGroup
    }
    college {
      school
      conference
    }
    recruitSchool {
      name
    }
    hometown {
      city
      state
    }
  }
}
And here are the results:
We've had one lone 2* WR who ended up at Toledo. Way to go, Cody!
I can slightly modify this query if I want to filter historical recruits by any geographic area, for example to query all-time recruits from the state of Alaska.
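As a rough sketch of what that variation could look like, here is the hometown query narrowed to a state only, written as a Python string so it can be dropped into the gql(...) call used in the Python section of this post. It assumes the state filter takes the two-letter abbreviation "AK", mirroring "OH" in the query above; that value and the trimmed field list are assumptions for illustration.

```python
# Assumption: states are stored as two-letter codes, as with "OH" above.
STATE = "AK"

# Same recruit query as before, with the city condition dropped.
alaska_query = """
query myQuery {
  recruit(
    where: {
      recruitType: { _eq: "HighSchool" }
      hometown: { state: { _eq: "%s" } }
    }
    orderBy: { rating: DESC }
  ) {
    year
    stars
    rating
    name
    hometown {
      city
      state
    }
  }
}
""" % STATE

print(alaska_query)
```

Swapping the STATE value is all it takes to point the same query at a different state.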
We can even do aggregates. For example, if I wanted to find the mean stars and ratings, and their respective standard deviations, for all Michigan recruits since 2016, I could run something like the below:
query myQuery {
  recruitAggregate(
    where: {
      college: { school: { _eq: "Michigan" } }
      year: { _gte: 2016 }
      recruitType: { _eq: "HighSchool" }
    }
  ) {
    aggregate {
      count
      avg {
        rating
        stars
      }
      stddev {
        rating
        stars
      }
    }
  }
}
Here are the results:
Using the CFBD GraphQL API with Python
I'll preface this section by stating that you can interface with GraphQL APIs using almost any programming language. It all amounts to a basic HTTP POST request, after all. If you can make an HTTP request, you can make a GraphQL request. That said, some tools and libraries make things much easier. If I'm being honest, TypeScript/JavaScript is the best ecosystem for working with GraphQL. Much like Python is largely unparalleled when it comes to libraries available for data science and machine learning, the TypeScript/JavaScript ecosystem is unparalleled when it comes to libraries and utilities for GraphQL.
However, I recognize that a large majority of CFBD users are working in Python. And frankly, Python is probably still the right choice for you if you are working in data and analytics. Luckily, Python does have its own set of libraries for working with GraphQL.
GQL is one of the more popular packages for interfacing with GraphQL APIs in Python. We can install it from PyPI:
pip install "gql[all]"
Or if you're using Conda:
conda install gql-with-all
Throughout this section, I will be running my Python code in a Jupyter notebook. However, you should be able to run this same code even if you aren't running in Jupyter. One caveat: the top-level await calls below work in a notebook, while in a plain script they need to be wrapped in an async function and run with asyncio.run.
We'll start off by importing packages from GQL:
from gql import Client, gql
from gql.transport.aiohttp import AIOHTTPTransport
Next, we'll create a transport around the CFBD GraphQL URL and a GraphQL client around this transport.
# The transport carries the endpoint URL and the auth header.
transport = AIOHTTPTransport(
    url="https://graphql.collegefootballdata.com/v1/graphql",
    headers={"Authorization": "Bearer YOUR_API_KEY_HERE"},
)
# The client fetches the schema from the endpoint so queries can be validated.
client = Client(transport=transport, fetch_schema_from_transport=True)
Note that this is also where you need to configure your API key. Replace YOUR_API_KEY_HERE in the above snippet with the API key you use for the CFBD API. Notice that we do need to supply a "Bearer " prefix here.
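Since the "Bearer " prefix is easy to forget (or to add twice), one option is a small helper that builds the header for you. The sketch below reads the key from an environment variable instead of hard-coding it; the variable name CFBD_API_KEY and the build_auth_headers helper are hypothetical names for this example, not part of GQL or the CFBD API.

```python
import os


def build_auth_headers(api_key: str) -> dict:
    """Return the Authorization header, adding the "Bearer " prefix if it is missing."""
    key = api_key.strip()
    if not key:
        raise ValueError("API key is empty")
    if not key.lower().startswith("bearer "):
        key = f"Bearer {key}"
    return {"Authorization": key}


# CFBD_API_KEY is a hypothetical environment variable name; the fallback is a placeholder.
headers = build_auth_headers(os.environ.get("CFBD_API_KEY", "YOUR_API_KEY_HERE"))
print(sorted(headers))
```

The resulting dict can be passed as the headers argument of the transport shown above.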
I'll mirror the previous section on using Insomnia. If you skipped it, I highly recommend checking it out. I find it's usually easier to design GraphQL queries in Insomnia prior to putting them into Python code.
Executing the same query, which grabs all #1 overall high school recruits since 2014 and sorts them in descending order of composite rating, looks like this:
query = gql(
    """
    query myQuery {
      recruit(
        where: {
          year: { _gte: 2014 }
          overallRank: { _eq: 1 }
          recruitType: { _eq: "HighSchool" }
        }
        orderBy: { rating: DESC }
      ) {
        rating
        name
        position {
          position
          positionGroup
        }
        college {
          school
          conference
        }
        recruitSchool {
          name
        }
      }
    }
    """
)
result = await client.execute_async(query)
result
This is what the output looks like in my Jupyter notebook.
We can run type(result) to see that result is a dict. It should be relatively easy to loop through this result and format it to our liking.
We can flatten all the dicts to make them easier to put into a DataFrame:
formatted = [dict(rating=r['rating'], name=r['name'], college=r['college']['school'], position=r['position']['position']) for r in result['recruit']]
formatted
We can now easily get this into a pandas DataFrame.
import pandas as pd
df = pd.DataFrame(formatted)
df.head()
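If you are running this as a plain script rather than in a notebook, top-level await is not available, so the async call needs to live inside a coroutine that is started with asyncio.run. The sketch below shows that pattern; StubClient is a stand-in I made up so the example runs without gql or network access, and in real code you would pass the Client and gql(...) query built above instead.

```python
import asyncio


class StubClient:
    """Stand-in for gql's Client so this sketch runs without network access."""

    async def execute_async(self, query):
        await asyncio.sleep(0)  # pretend to wait on the network
        return {"recruit": []}


async def fetch(client, query):
    # Same call as in the notebook cells, but inside a coroutine.
    return await client.execute_async(query)


def main():
    # In a real script, pass the gql Client and the gql(...) document here.
    return asyncio.run(fetch(StubClient(), "query myQuery { ... }"))


result = main()
print(type(result))
```

The rest of the code (flattening the rows, building the DataFrame) works the same way once result has been returned.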
Let's run another query. This time I'm going to query Michigan's historical entries in the AP poll, sorted with the most recent appearances first.
query = gql(
    """
    query myQuery {
      pollRank(
        where: {
          team: { school: { _eq: "Michigan" } }
          poll: { pollType: { name: { _eq: "AP Top 25" } } }
        }
        orderBy: [
          { poll: { season: DESC } }
          { poll: { seasonType: DESC } }
          { poll: { week: DESC } }
        ]
      ) {
        rank
        points
        firstPlaceVotes
        poll {
          season
          seasonType
          week
          pollType {
            name
          }
        }
      }
    }
    """
)
result = await client.execute_async(query)
result
We can again flatten this and load it into a DataFrame if we want, but I'll leave that up to you.
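For anyone who wants a starting point, here is one possible way to flatten rows with nested objects like these, using only the standard library. The flatten helper is a generic sketch rather than anything provided by GQL, and the sample row is made up to match the shape of the pollRank query; its values are not real poll data.

```python
def flatten(row: dict, prefix: str = "") -> dict:
    """Collapse nested dicts into one level, joining keys with underscores."""
    flat = {}
    for key, value in row.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}_"))
        else:
            flat[name] = value
    return flat


# Hypothetical row in the shape requested by the pollRank query; values are made up.
sample = {
    "rank": 1,
    "points": 1500,
    "firstPlaceVotes": 50,
    "poll": {
        "season": 2023,
        "seasonType": "regular",
        "week": 10,
        "pollType": {"name": "AP Top 25"},
    },
}

flat_rows = [flatten(row) for row in [sample]]
print(flat_rows[0])
```

With a real result, something like pd.DataFrame([flatten(r) for r in result['pollRank']]) should give one column per flattened key.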
Conclusion
I hope that illustrates the power of GraphQL and what it can do for you. It allows for much more flexibility and fewer restrictions. I get requests all the time for querying the data in different ways or different formats, or for allowing different types of query parameters. This can be very difficult to keep up with and maintain in a traditional REST API, but it is easy work with GraphQL.
Again, this is available to you if you are a Patreon Tier 3 subscriber. Head over to Patreon if you are interested in checking it out. I will reiterate that this is very experimental right now. If there are pieces of data available in the REST API that you would like to see here, I am in the process of adding more and more data. Another huge benefit is real-time GraphQL subscriptions, but I'll save that for a future post. If you end up checking it out, let me know what you think!