Step 2: Install Dependencies
Now that you've generated the module, let's navigate to its directory and install dependencies:
cd ../../connectors/source-<name>
python -m venv .venv # Create a virtual environment in the .venv directory
source .venv/bin/activate # enable the venv
pip install -r requirements.txt
This step sets up the initial Python environment. All subsequent python or pip commands assume you have activated your virtual environment.
Let's verify everything is working as intended. Run:
python main.py spec
You should see some output:
{"type": "SPEC", "spec": {"documentationUrl": "https://docsurl.com", "connectionSpecification": {"$schema": "http://json-schema.org/draft-07/schema#", "title": "Python Http Tutorial Spec", "type": "object", "required": ["TODO"], "properties": {"TODO: This schema defines the configuration required for the source. This usually involves metadata such as database and/or authentication information.": {"type": "string", "description": "describe me"}}}}}
We just ran Airbyte Protocol's spec command! We'll talk more about this later, but this is a simple sanity check to make sure everything is wired up correctly.
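If you want to check that output programmatically rather than by eye, a minimal sketch could look like the following. It assumes the connector prints one JSON message per line, as in the output shown above. The helper name parse_spec_message is hypothetical, and the long placeholder key under "properties" is abbreviated to "TODO" to keep the sample readable.

```python
import json

# Sample line based on the output shown above (the "properties" key is abbreviated).
SPEC_LINE = (
    '{"type": "SPEC", "spec": {"documentationUrl": "https://docsurl.com", '
    '"connectionSpecification": {"$schema": "http://json-schema.org/draft-07/schema#", '
    '"title": "Python Http Tutorial Spec", "type": "object", "required": ["TODO"], '
    '"properties": {"TODO": {"type": "string", "description": "describe me"}}}}}'
)


def parse_spec_message(line: str) -> dict:
    """Parse one line of connector output and return its connectionSpecification.

    Hypothetical helper: raises ValueError if the line is not a SPEC message.
    """
    message = json.loads(line)
    if message.get("type") != "SPEC":
        raise ValueError(f"expected a SPEC message, got {message.get('type')!r}")
    return message["spec"]["connectionSpecification"]


spec = parse_spec_message(SPEC_LINE)
print(spec["title"])     # Python Http Tutorial Spec
print(spec["required"])  # ['TODO']
```

The "required": ["TODO"] entry is a reminder that the generated specification is still a placeholder to be filled in.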
Note that the main.py file is a simple script that makes it easy to run your connector. Its invocation format is python main.py <command> [args]. See the module's generated README.md for the commands it supports.
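To make the invocation format concrete, here is a minimal, hypothetical sketch of a script that accepts the same <command> [args] shape using only the standard library. It is not the generated main.py. The function name build_parser is made up for illustration, and only the commands and flags that appear in this tutorial (spec, check, discover, read, --config, --catalog) are modeled.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for `main.py <command> [args]` (illustrative only)."""
    parser = argparse.ArgumentParser(prog="main.py")
    subparsers = parser.add_subparsers(dest="command", required=True)

    # `spec` takes no arguments.
    subparsers.add_parser("spec")

    # `check` and `discover` each need a config file.
    for name in ("check", "discover"):
        sub = subparsers.add_parser(name)
        sub.add_argument("--config", required=True)

    # `read` needs both a config file and a configured catalog.
    read = subparsers.add_parser("read")
    read.add_argument("--config", required=True)
    read.add_argument("--catalog", required=True)
    return parser


args = build_parser().parse_args(
    ["read", "--config", "secrets/config.json",
     "--catalog", "sample_files/configured_catalog.json"]
)
print(args.command, args.config, args.catalog)
```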
Notes on iteration cycle
Dependencies
Python dependencies for your source should be declared in airbyte-integrations/connectors/source-<source-name>/setup.py in the install_requires field. You will notice that a couple of Airbyte dependencies are already declared there. Do not remove these; they give your source access to the helper interfaces provided by the generator.
You may notice that there is a requirements.txt in your source's directory as well. Don't edit this. It is autogenerated and used to provide Airbyte dependencies. All your dependencies should be declared in setup.py.
Development Environment
The commands we ran above created a Python virtual environment for your source. If you want your IDE to autocomplete and resolve dependencies properly, point it at the virtual environment in airbyte-integrations/connectors/source-<source-name>/.venv. Also, any time you change the dependencies in setup.py, make sure to re-run pip install -r requirements.txt.
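If you are unsure whether your shell or IDE is actually using the interpreter inside .venv, a quick check like the sketch below can help. It relies on standard-library behavior: inside an environment created with python -m venv, sys.prefix differs from sys.base_prefix. The function name in_virtualenv is just an illustrative choice.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
print("interpreter prefix:", sys.prefix)
```

When the environment is active, the printed prefix should end in the .venv directory of your source.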
Iterating on your implementation
There are two ways we recommend iterating on a source. Consider using whichever one matches your style.
Run the source using python
You'll notice in your source's directory that there is a Python file called main.py. This file exists as a convenience for development. You run it to test that your source works:
# from airbyte-integrations/connectors/source-<name>
python main.py spec
python main.py check --config secrets/config.json
python main.py discover --config secrets/config.json
python main.py read --config secrets/config.json --catalog sample_files/configured_catalog.json
The nice thing about this approach is that you can iterate completely within Python. The downside is that you are not quite running your source as it will actually be run by Airbyte. Specifically, you're not running it from within the Docker container that will house it.
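If you find yourself retyping the four commands above, a small helper script can run them in order. The sketch below is one possible approach, not part of the generated module: the names connector_commands and run_all are hypothetical, and it assumes it is started from the connector directory with the same secrets/config.json and sample_files/configured_catalog.json paths used above.

```python
import subprocess
import sys
from typing import List

CONFIG = "secrets/config.json"
CATALOG = "sample_files/configured_catalog.json"


def connector_commands(config: str = CONFIG, catalog: str = CATALOG) -> List[List[str]]:
    """Return the argv lists for spec, check, discover and read, in that order."""
    # sys.executable is the interpreter running this script, so the commands
    # use the same (ideally activated) virtual environment.
    base = [sys.executable, "main.py"]
    return [
        base + ["spec"],
        base + ["check", "--config", config],
        base + ["discover", "--config", config],
        base + ["read", "--config", config, "--catalog", catalog],
    ]


def run_all(cwd: str = ".") -> None:
    """Run each command in turn, stopping at the first one that fails."""
    for argv in connector_commands():
        print("running:", " ".join(argv[1:]))
        subprocess.run(argv, cwd=cwd, check=True)
```

Calling run_all() from the connector directory runs the whole sequence; check=True makes a failing command raise subprocess.CalledProcessError, so later steps are not run against a broken source.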
Run the source using docker
If you want to run your source exactly as it will be run by Airbyte (i.e. within a Docker container), you can use the following commands from the connector module directory (airbyte-integrations/connectors/source-python-http-example):
# First build the container
docker build . -t airbyte/source-<name>:dev
# Then use the following commands to run it
docker run --rm airbyte/source-<name>:dev spec
docker run --rm -v $(pwd)/secrets:/secrets airbyte/source-<name>:dev check --config /secrets/config.json
docker run --rm -v $(pwd)/secrets:/secrets airbyte/source-<name>:dev discover --config /secrets/config.json
docker run --rm -v $(pwd)/secrets:/secrets -v $(pwd)/sample_files:/sample_files airbyte/source-<name>:dev read --config /secrets/config.json --catalog /sample_files/configured_catalog.json
Note: Each time you make a change to your implementation, you need to rebuild the connector image via docker build . -t airbyte/source-<name>:dev. This ensures the new Python code is added into the Docker container.
The nice thing about this approach is that you are running your source exactly as it will be run by Airbyte. The tradeoff is that iteration is slightly slower, because the connector image must be rebuilt after each change.
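The docker run commands above all follow one pattern: optional -v mounts, then the image, then the connector command and its arguments. The sketch below assembles such a command as an argument list. The function name docker_run_command and the image name airbyte/source-example:dev are hypothetical stand-ins; substitute your own source-<name> image.

```python
from pathlib import Path
from typing import Dict, List


def docker_run_command(
    image: str, command: str, mounts: Dict[str, str], args: List[str]
) -> List[str]:
    """Build a `docker run --rm` argv list with host directories mounted."""
    argv = ["docker", "run", "--rm"]
    for host_dir, container_dir in mounts.items():
        # Docker needs an absolute host path, like $(pwd)/secrets in the shell version.
        host_path = Path(host_dir).resolve()
        argv += ["-v", f"{host_path}:{container_dir}"]
    return argv + [image, command] + args


cmd = docker_run_command(
    image="airbyte/source-example:dev",  # hypothetical image name
    command="check",
    mounts={"secrets": "/secrets"},
    args=["--config", "/secrets/config.json"],
)
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) once the image has been built.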