Schemathesis progress report
It's been slightly more than a year since I started working on Schemathesis. In this article, I want to summarize what we accomplished during this year and what you can expect from the project in the future.
Target audience: People curious about effective ways to test web APIs and willing to spend less time writing tests.
ANNOUNCEMENT: I am building a service for API fuzzing. Sign up to check your API now!
What is Schemathesis?
When you develop a web API, it is good practice to tell your users how to communicate with it so they can use it easily. One of the most effective solutions is to provide a machine- and human-readable schema that describes all the contracts the API should hold.
However, it implies additional challenges, and the most important of them is keeping the schema in sync with the actual API behavior. If the API behaves differently than described in its schema, it may ignore some payload parts silently, or worse - the clients might not be able to communicate with it at all.
You, as a developer, should ensure that the actual API behavior conforms to the declared schema. Unfortunately, it often involves tedious manual synchronization and huge test suites. It is expensive and error-prone and still doesn't cover many real-life situations.
Here comes Schemathesis - it allows you to test your API for conformance with its schema. And you don’t have to write tests or add examples to the schema - Schemathesis will generate everything needed. It can be as simple as running a single command:
$ schemathesis run https://api.company.com/api/openapi.json
Without any configuration, it will run hundreds of tests to verify that your application works as intended, using property-based testing backed by Hypothesis.
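To give a feel for the idea behind property-based testing, here is a minimal stdlib-only sketch. It is not how Schemathesis or Hypothesis work internally. The schema, the `generate` helper, and the buggy `create_user` handler are all hypothetical and exist only for illustration. The sketch generates random payloads that fit a tiny schema and checks one property: the handler never answers with a server error.

```python
import random
import string

# Hypothetical, heavily simplified schema for a request payload.
SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
}


def generate(schema, rng):
    """Produce a random value that fits the (simplified) schema."""
    if schema["type"] == "integer":
        return rng.randint(-1000, 1000)
    if schema["type"] == "string":
        length = rng.randint(0, 10)
        return "".join(rng.choice(string.printable) for _ in range(length))
    if schema["type"] == "object":
        return {key: generate(sub, rng) for key, sub in schema["properties"].items()}
    raise ValueError(f"Unsupported type: {schema['type']}")


def create_user(payload):
    """Hypothetical handler with a bug: it assumes a non-empty name."""
    try:
        payload["name"][0].upper()  # raises IndexError for an empty string
    except IndexError:
        return 500  # the unhandled edge case surfaces as a server error
    return 201


def find_server_errors(handler, schema, runs=200, seed=0):
    """Return the generated payloads for which the handler answers with 5xx."""
    rng = random.Random(seed)
    failures = []
    for _ in range(runs):
        payload = generate(schema, rng)
        if handler(payload) >= 500:
            failures.append(payload)
    return failures


failures = find_server_errors(create_user, SCHEMA)
print(f"{len(failures)} failing payloads, e.g. {failures[:1]}")
```

A hand-written test would likely use a "normal" name and miss the empty string. Random generation over the whole input space tends to hit such edge cases quickly.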
Besides the CLI, you can easily use it in your Python tests:
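import schemathesis

# Load the API schema from a running application
schema = schemathesis.from_uri("http://0.0.0.0:8080/swagger.json")


# Generate test cases for every endpoint in the schema
@schema.parametrize()
def test_no_server_errors(case):
    response = case.call()
    case.validate_response(response)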
Here are some error examples Schemathesis can find in real-life applications:
- Ignored request's payload because of a typo in the field name;
- Obsolete fields that caused DB-level errors. They were not removed from the API schema after a database migration;
- Denial of Service caused by uncommon values in the request's payload;
- Unexpected responses from the application under test;
- Regular expressions in input validation that caused DB data corruption or server errors;
- Many more cases that lead to server-side errors.
Some of these errors also happened in production, but Schemathesis was able to trigger them in a few seconds during a test run.
There are many aspects of how it can help you, but most importantly, Schemathesis:
- Significantly decreases the costs of finding defects in your APIs;
- Helps you to deliver faster by making errors visible;
- Is easily customizable with your code so it can fit your workflows.
It supports Open API 2/3 and GraphQL, and we plan to support more specs in the future, including Async API, RAML, and API Blueprint.
What do we have so far?
Everything started with a small library that creates Hypothesis strategies for Swagger schemas. Then I implemented support for Open API 3 and recently added GraphQL integration.
My main focus was the CLI, and over time it became a full-featured way to run Schemathesis tests.
Stateful testing
Via Open API links, tests can cover multiple endpoints and generate whole sequences of API calls by reusing requests and responses from one endpoint in requests to another.
If your API uses Open API links, add --stateful=links and enjoy stateful testing.
/users:
  post:
    ...
    responses:
      '201':
        ...
        links:
          GetUserByUserId:
            operationId: getUser
            parameters:
              userId: '$response.body#/id'
/users/{userId}:
  get:
    operationId: getUser
    parameters:
      - in: path
        name: userId
        required: true
        schema:
          type: integer
Using the schema above, Schemathesis will create users via POST /users and then will reuse returned IDs to make GET /users/{userId} requests.
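The way a link expression such as `$response.body#/id` turns a response into the next request can be sketched in a few lines of plain Python. This is only an illustration of the idea, not Schemathesis's implementation. The helper names and the sample response body are hypothetical, and only the `$response.body#<JSON pointer>` form of runtime expressions is handled.

```python
def resolve_pointer(document, pointer):
    """Follow an RFC 6901 JSON pointer such as '/id' or '/items/0/id'."""
    if pointer == "":
        return document  # the empty pointer refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError(f"Invalid JSON pointer: {pointer!r}")
    current = document
    for token in pointer[1:].split("/"):
        # Unescape per RFC 6901: '~1' means '/', '~0' means '~'
        token = token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            current = current[int(token)]
        else:
            current = current[token]
    return current


def evaluate(expression, response_body):
    """Evaluate a '$response.body#<pointer>' runtime expression."""
    prefix = "$response.body#"
    if not expression.startswith(prefix):
        raise ValueError(f"Unsupported expression: {expression!r}")
    return resolve_pointer(response_body, expression[len(prefix):])


def follow_link(link_parameters, response_body, path_template):
    """Fill the target path template with values taken from the response."""
    values = {
        name: evaluate(expression, response_body)
        for name, expression in link_parameters.items()
    }
    return path_template.format(**values)


# Hypothetical body returned by POST /users
created_user = {"id": 42, "name": "Jane"}
next_path = follow_link(
    {"userId": "$response.body#/id"}, created_user, "/users/{userId}"
)
print(next_path)  # /users/42
```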
VCR cassettes
You can record all network interactions during tests and replay them later.
$ schemathesis run --store-network-log=cassette.yaml https://api.com/openapi.json
This command will create a new YAML file that will contain network interactions in VCR format.
And the following command will replay only failed requests:
$ schemathesis replay cassette.yaml --status=FAILURE
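Conceptually, replaying by status means filtering the recorded interactions and re-sending the matching requests. The sketch below shows only the filtering step on an in-memory list. The record layout and field names are assumptions made for illustration and may differ from the real cassette format. YAML parsing and the actual network calls are left out.

```python
# Hypothetical in-memory representation of recorded interactions.
# Field names here are assumptions, not the documented cassette format.
interactions = [
    {
        "id": "1",
        "status": "SUCCESS",
        "request": {"method": "GET", "uri": "http://127.0.0.1:8080/users/1"},
    },
    {
        "id": "2",
        "status": "FAILURE",
        "request": {"method": "POST", "uri": "http://127.0.0.1:8080/users"},
    },
]


def select_by_status(records, status):
    """Keep only the interactions recorded with the given status."""
    return [record for record in records if record["status"] == status]


failed = select_by_status(interactions, "FAILURE")
for record in failed:
    request = record["request"]
    # A real replay would send the request again; here we only report it.
    print(f"Would replay {request['method']} {request['uri']}")
```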
Explicit examples
Schemathesis verifies examples that are explicitly defined in schemas.
paths:
  get:
    parameters:
    - in: body
      name: body
      required: true
      schema:
        $ref: '#/definitions/Pet'
definitions:
  Pet:
    additionalProperties: false
    example:
      name: Doggo
    properties:
      name:
        type: string
    required:
    - name
    type: object
Don’t worry if you don’t have examples for all properties - Schemathesis will generate missing parts for you.
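The idea of completing a partial example can be sketched as follows. This is a simplified illustration, not the actual Schemathesis logic. The `age` property is a hypothetical addition to the `Pet` schema so that there is something to fill in, and the helper names are made up.

```python
import random

PET_SCHEMA = {
    "type": "object",
    "example": {"name": "Doggo"},
    "properties": {
        "name": {"type": "string"},
        # Hypothetical extra property that has no explicit example
        "age": {"type": "integer"},
    },
    "required": ["name", "age"],
}


def generate_value(schema, rng):
    """Generate a random value for a (very limited) set of types."""
    if schema["type"] == "integer":
        return rng.randint(0, 100)
    if schema["type"] == "string":
        length = rng.randint(1, 8)
        return "".join(rng.choice("abcdefghijklmnopqrstuvwxyz") for _ in range(length))
    raise ValueError(f"Unsupported type: {schema['type']}")


def complete_example(schema, rng):
    """Keep explicit example values and generate only the missing required ones."""
    result = dict(schema.get("example", {}))
    for name in schema.get("required", []):
        if name not in result:
            result[name] = generate_value(schema["properties"][name], rng)
    return result


pet = complete_example(PET_SCHEMA, random.Random(0))
print(pet)  # "name" stays "Doggo", "age" is generated
```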
Multithreaded execution
You can execute tests in multiple threads, which may significantly improve the testing performance. The following command will run tests in four threads:
$ schemathesis run --workers 4 https://api.company.com/api/openapi.json
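Threads help here mostly because API tests spend much of their time waiting on the network. The following sketch shows the general pattern with the standard library. It is an assumption about the approach, not a description of how Schemathesis distributes work. The operation list and `run_tests_for` are hypothetical stand-ins.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical API operations to test
OPERATIONS = [
    "GET /users",
    "POST /users",
    "GET /users/{userId}",
    "DELETE /users/{userId}",
]


def run_tests_for(operation):
    """Stand-in for testing one operation; sleeps to mimic network latency."""
    time.sleep(0.05)
    return operation, "passed"


def run_all(operations, workers):
    """Test operations concurrently and collect results per operation."""
    with ThreadPoolExecutor(max_workers=workers) as executor:
        # map() preserves input order and yields (operation, result) pairs
        return dict(executor.map(run_tests_for, operations))


results = run_all(OPERATIONS, workers=4)
print(results)
```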
There are many more features! Take a look at our documentation.
I am incredibly pleased to see Schemathesis adoption and how it helps with finding bugs in real applications. For example, in this master’s thesis, Patrik Segedy shows how an extended version of Schemathesis found 32 bugs in the RedHat Insights application.
Another study from Jukka Pajukangas gives an analysis of five tools for API testing and concludes:
Analysis of results revealed that tool named Schemathesis produced best results in most of the situations.
Folks from IBM implemented an open-source tool that extends Schemathesis and tests IBM cloud services for schema compliance.
What is next?
When I started this project, my main goal was to decrease the amount of effort needed to test a web API, and I want to drive it even further.
Schemathesis.io
We are currently working on a service that will provide a single-click testing experience without leaving your browser. It is based on Schemathesis, but with cool extras like:
- Schema improvement recommendations
- Much faster test generation powered by Rust extensions
- Test reports & scheduled runs
- Many more checks with in-browser customization
- AI-powered inference for stateful testing
- Open API callbacks testing
- API, webhooks, and integrations with popular CI providers
Even though this is a closed source service, I plan to use it as a primary battleground for testing new features for open source Schemathesis and eventually add them to the project itself.
The service is currently in active development and will be available soon.
Full-featured GraphQL support
There are already some in-progress branches to make GraphQL a first-class citizen in Schemathesis. Integration with CLI will be available in the next few months.
Stateful testing improvements
As the next step, I’d like to integrate the work done by Patrik Segedy and make Schemathesis able to automatically infer connections between different endpoints. Such tests can reach much deeper into your codebase than before, without any changes to the API schema.
Complete support for recursive schemas in Open API
Schemathesis supports recursive schemas, but in some edge cases, they may cause significant performance degradation. There is already a proof-of-concept pull request to remove this limitation and make even complex Open API schemas easily testable.
Negative testing
Schemathesis generates data that fits the input API schema, which already covers a lot. Having test cases that intentionally contain data that is invalid will help uncover even more errors. I implemented a draft some time ago but didn’t manage to push it to a usable stage - this feature is close to the top of my TODO list.
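One possible approach to negative testing is to start from a valid payload and break it in one specific way at a time. The sketch below illustrates that idea only. It is not the draft mentioned above, and the schema, the tiny `is_valid` checker, and the `mutations` helper are all hypothetical.

```python
# Hypothetical, simplified schema
SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name"],
}

PYTHON_TYPES = {"string": str, "integer": int}


def is_valid(payload, schema):
    """Tiny validator: checks required keys and primitive types only."""
    if any(name not in payload for name in schema["required"]):
        return False
    for name, value in payload.items():
        if name not in schema["properties"]:
            return False
        expected = PYTHON_TYPES[schema["properties"][name]["type"]]
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, expected):
            return False
    return True


def mutations(valid_payload, schema):
    """Yield copies of a valid payload, each violating the schema in one way."""
    for name in schema["required"]:
        broken = dict(valid_payload)
        del broken[name]
        yield f"missing required '{name}'", broken
    for name in schema["properties"]:
        if name in valid_payload:
            broken = dict(valid_payload)
            broken[name] = None  # null is neither a string nor an integer
            yield f"wrong type for '{name}'", broken


valid = {"name": "Doggo", "age": 3}
cases = list(mutations(valid, SCHEMA))
for description, payload in cases:
    print(description, payload, "valid" if is_valid(payload, SCHEMA) else "invalid")
```

An API that accepts any of these payloads, or crashes on them, is behaving differently from what its schema promises.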
Besides these features, I plan to focus on improving documentation and providing more usage examples.
If you have something on your mind, please let us know, and we can make it better.
I also plan to write small development updates each month and create a roadmap in our GitHub repo to keep the project development transparent.
If you want to contribute to Schemathesis, then check out our issues board. There are a lot of tasks to work on. Some of them are marked with the "Hacktoberfest" label and have detailed implementation notes inside.
Conclusion
I hope that I shed some light on what Schemathesis is, what it might look like in the future, and how it could be useful in API development workflows.
If you have any questions - feel free to contact me on Twitter or via email.
One more important point: I recently left my full-time job to work as an independent software consultant and spend my time on Schemathesis and its ecosystem. You can hire me if you need to set up or improve your testing workflows or if you want to integrate Schemathesis into them.
If you or your company uses Schemathesis and wishes to support the project, please consider funding my work on GitHub Sponsors. It will help me to develop new features faster and spend more time supporting users.
Big thanks to Zac Hatfield-Dodds and Alexander Hultnér for their help with making Schemathesis better.
Sincerely, Dmitry
❤ ❤ ❤