Install Airflow on your local MacBook

****************************** Step 1 *****************************

Create a new Airflow directory anywhere on your laptop

(base) saurabhkumar@Saurabhs-MacBook-Pro spark-3.1.1-bin-hadoop2.7 % cd ~/Documents

(base) saurabhkumar@Saurabhs-MacBook-Pro Documents % mkdir airflow-tutorial

(base) saurabhkumar@Saurabhs-MacBook-Pro Documents % cd airflow-tutorial
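The same step can be written so that it is safe to rerun. This is a minimal sketch; PROJECT_DIR is a hypothetical variable name introduced here, defaulting to the location used above.

```shell
# PROJECT_DIR is a hypothetical name; override it to use a different location.
PROJECT_DIR="${PROJECT_DIR:-$HOME/Documents/airflow-tutorial}"

# -p creates missing parent directories and does not fail if the directory already exists.
mkdir -p "$PROJECT_DIR"

# Move into the directory and print it to confirm where we ended up.
cd "$PROJECT_DIR" && pwd
```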

 

************************** Step 2 *******************************

Create a Python virtual environment with conda

(base) saurabhkumar@Saurabhs-MacBook-Pro airflow-tutorial % conda create --name airflow-tutorial1 python=3.7

Collecting package metadata (current_repodata.json): done

 

************************* Step 3 **************************

Activate the Python virtual environment

(base) saurabhkumar@Saurabhs-MacBook-Pro airflow-tutorial % conda activate airflow-tutorial1

(airflow-tutorial1) saurabhkumar@Saurabhs-MacBook-Pro airflow-tutorial % pwd

/Users/saurabhkumar/Documents/airflow-tutorial

 

*********************** Step 4 *****************************

Export the absolute path of the project directory as AIRFLOW_HOME

(airflow-tutorial1) saurabhkumar@Saurabhs-MacBook-Pro airflow-tutorial % export AIRFLOW_HOME=/Users/saurabhkumar/Documents/airflow-tutorial
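If you would rather not type the path by hand, the export can take it from the current directory. This is a sketch that assumes you are still inside the project directory from Step 1. The export lasts only for the current shell session, which is why Step 9 repeats it in the new tab.

```shell
# Run this from inside the project directory created in Step 1.
export AIRFLOW_HOME="$(pwd)"

# Confirm the value; Airflow keeps airflow.cfg and airflow.db under this path.
echo "AIRFLOW_HOME=$AIRFLOW_HOME"
```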

 

************************* Step 5 *****************************

Now install Airflow 1.10.10

 

(airflow-tutorial1) saurabhkumar@Saurabhs-MacBook-Pro airflow-tutorial % pip install 'apache-airflow[gcp,statsd,sentry]'==1.10.10

Collecting apache-airflow[gcp,sentry,statsd]==1.10.10

Downloading apache_airflow-1.10.10-py2.py3-none-any.whl (4.7 MB)

|████████████████████████████████| 4.7 MB 554 kB/s

Collecting pandas<1.0.0,>=0.17.1

(remaining output omitted; the installation finishes successfully)

 

************************* Step 6 *********************************

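Pin SQLAlchemy to 1.3.23 and Flask-SQLAlchemy to 2.4.4. The previous step pulled in newer releases (SQLAlchemy 1.4.9 and Flask-SQLAlchemy 2.5.1, as the uninstall lines below show), so downgrade both.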

(airflow-tutorial1) saurabhkumar@Saurabhs-MacBook-Pro airflow-tutorial % pip install SQLAlchemy==1.3.23

Collecting SQLAlchemy==1.3.23

Downloading SQLAlchemy-1.3.23-cp37-cp37m-macosx_10_14_x86_64.whl (1.2 MB)

|████████████████████████████████| 1.2 MB 2.5 MB/s

Installing collected packages: SQLAlchemy

Attempting uninstall: SQLAlchemy

Found existing installation: SQLAlchemy 1.4.9

Uninstalling SQLAlchemy-1.4.9:

Successfully uninstalled SQLAlchemy-1.4.9

Successfully installed SQLAlchemy-1.3.23

(airflow-tutorial1) saurabhkumar@Saurabhs-MacBook-Pro airflow-tutorial % pip install Flask-SQLAlchemy==2.4.4

Collecting Flask-SQLAlchemy==2.4.4

Downloading Flask_SQLAlchemy-2.4.4-py2.py3-none-any.whl (17 kB)

Requirement already satisfied: Flask>=0.10 in /Users/saurabhkumar/opt/anaconda3/envs/airflow-tutorial1/lib/python3.7/site-packages (from Flask-SQLAlchemy==2.4.4) (1.1.2)

Requirement already satisfied: SQLAlchemy>=0.8.0 in /Users/saurabhkumar/opt/anaconda3/envs/airflow-tutorial1/lib/python3.7/site-packages (from Flask-SQLAlchemy==2.4.4) (1.3.23)

Requirement already satisfied: Werkzeug>=0.15 in /Users/saurabhkumar/opt/anaconda3/envs/airflow-tutorial1/lib/python3.7/site-packages (from Flask>=0.10->Flask-SQLAlchemy==2.4.4) (0.16.1)

Requirement already satisfied: Jinja2>=2.10.1 in /Users/saurabhkumar/opt/anaconda3/envs/airflow-tutorial1/lib/python3.7/site-packages (from Flask>=0.10->Flask-SQLAlchemy==2.4.4) (2.10.3)

Requirement already satisfied: click>=5.1 in /Users/saurabhkumar/opt/anaconda3/envs/airflow-tutorial1/lib/python3.7/site-packages (from Flask>=0.10->Flask-SQLAlchemy==2.4.4) (7.1.2)

Requirement already satisfied: itsdangerous>=0.24 in /Users/saurabhkumar/opt/anaconda3/envs/airflow-tutorial1/lib/python3.7/site-packages (from Flask>=0.10->Flask-SQLAlchemy==2.4.4) (1.1.0)

Requirement already satisfied: MarkupSafe>=0.23 in /Users/saurabhkumar/opt/anaconda3/envs/airflow-tutorial1/lib/python3.7/site-packages (from Jinja2>=2.10.1->Flask>=0.10->Flask-SQLAlchemy==2.4.4) (1.1.1)

Installing collected packages: Flask-SQLAlchemy

Attempting uninstall: Flask-SQLAlchemy

Found existing installation: Flask-SQLAlchemy 2.5.1

Uninstalling Flask-SQLAlchemy-2.5.1:

Successfully uninstalled Flask-SQLAlchemy-2.5.1

Successfully installed Flask-SQLAlchemy-2.4.4
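To confirm that the versions pinned in Steps 5 and 6 are the ones actually installed, a small check like the following may help. check_pin is a hypothetical helper written for this post, not part of pip or Airflow; it only reads the Version line that pip show prints.

```shell
# check_pin NAME EXPECTED: compare the installed version of NAME with EXPECTED.
# (check_pin is a hypothetical helper, not a pip or Airflow command.)
check_pin() {
  # pip show prints a "Version: X" line for an installed package and nothing for a missing one.
  installed="$(pip show "$1" 2>/dev/null | awk '/^Version:/ {print $2}')"
  if [ "$installed" = "$2" ]; then
    echo "OK $1==$2"
  else
    echo "MISMATCH $1: wanted $2, found ${installed:-nothing}"
  fi
}

check_pin apache-airflow 1.10.10
check_pin SQLAlchemy 1.3.23
check_pin Flask-SQLAlchemy 2.4.4
```

Run it inside the activated airflow-tutorial1 environment so that pip refers to that environment's packages.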

 

************************ Step 7 *******************************

Initialize the Airflow database

(airflow-tutorial1) saurabhkumar@Saurabhs-MacBook-Pro airflow-tutorial % airflow initdb   

DB: sqlite:////Users/saurabhkumar/Documents/airflow-tutorial/airflow.db

[2021-04-18 21:15:50,507] {db.py:378} INFO - Creating tables

INFO  [alembic.runtime.migration] Context impl SQLiteImpl.

INFO  [alembic.runtime.migration] Will assume non-transactional DDL.

INFO  [alembic.runtime.migration] Running upgrade  -> e3a246e0dc1, current schema

INFO  [alembic.runtime.migration] Running upgrade e3a246e0dc1 -> 1507a7289a2f, create is_encrypted

/Users/saurabhkumar/opt/anaconda3/envs/airflow-tutorial1/lib/python3.7/site-packages/alembic/ddl/sqlite.py:44: UserWarning: Skipping unsupported ALTER for creation of implicit constraintPlease refer to the batch mode feature which allows for SQLite migrations using a copy-and-move strategy.

"Skipping unsupported ALTER for "

INFO  [alembic.runtime.migration] Running upgrade 1507a7289a2f -> 13eb55f81627, maintain history for compatibility with earlier migrations

INFO  [alembic.runtime.migration] Running upgrade 13eb55f81627 -> 338e90f54d61, More logging into task_instance

INFO  [alembic.runtime.migration] Running upgrade 338e90f54d61 -> 52d714495f0, job_id indices

INFO  [alembic.runtime.migration] Running upgrade 52d714495f0 -> 502898887f84, Adding extra to Log

INFO  [alembic.runtime.migration] Running upgrade 502898887f84 -> 1b38cef5b76e, add dagrun

INFO  [alembic.runtime.migration] Running upgrade 1b38cef5b76e -> 2e541a1dcfed, task_duration

INFO  [alembic.runtime.migration] Running upgrade 2e541a1dcfed -> 40e67319e3a9, dagrun_config

INFO  [alembic.runtime.migration] Running upgrade 40e67319e3a9 -> 561833c1c74b, add password column to user

INFO  [alembic.runtime.migration] Running upgrade 561833c1c74b -> 4446e08588, dagrun start end

INFO  [alembic.runtime.migration] Running upgrade 4446e08588 -> bbc73705a13e, Add notification_sent column to sla_miss

INFO  [alembic.runtime.migration] Running upgrade bbc73705a13e -> bba5a7cfc896, Add a column to track the encryption state of the 'Extra' field in connection

INFO  [alembic.runtime.migration] Running upgrade bba5a7cfc896 -> 1968acfc09e3, add is_encrypted column to variable table

INFO  [alembic.runtime.migration] Running upgrade 1968acfc09e3 -> 2e82aab8ef20, rename user table

INFO  [alembic.runtime.migration] Running upgrade 2e82aab8ef20 -> 211e584da130, add TI state index

INFO  [alembic.runtime.migration] Running upgrade 211e584da130 -> 64de9cddf6c9, add task fails journal table

INFO  [alembic.runtime.migration] Running upgrade 64de9cddf6c9 -> f2ca10b85618, add dag_stats table

INFO  [alembic.runtime.migration] Running upgrade f2ca10b85618 -> 4addfa1236f1, Add fractional seconds to mysql tables

INFO  [alembic.runtime.migration] Running upgrade 4addfa1236f1 -> 8504051e801b, xcom dag task indices

INFO  [alembic.runtime.migration] Running upgrade 8504051e801b -> 5e7d17757c7a, add pid field to TaskInstance

INFO  [alembic.runtime.migration] Running upgrade 5e7d17757c7a -> 127d2bf2dfa7, Add dag_id/state index on dag_run table

INFO  [alembic.runtime.migration] Running upgrade 127d2bf2dfa7 -> cc1e65623dc7, add max tries column to task instance

INFO  [alembic.runtime.migration] Running upgrade cc1e65623dc7 -> bdaa763e6c56, Make xcom value column a large binary

INFO  [alembic.runtime.migration] Running upgrade bdaa763e6c56 -> 947454bf1dff, add ti job_id index

INFO  [alembic.runtime.migration] Running upgrade 947454bf1dff -> d2ae31099d61, Increase text size for MySQL (not relevant for other DBs' text types)

INFO  [alembic.runtime.migration] Running upgrade d2ae31099d61 -> 0e2a74e0fc9f, Add time zone awareness

INFO  [alembic.runtime.migration] Running upgrade d2ae31099d61 -> 33ae817a1ff4, kubernetes_resource_checkpointing

INFO  [alembic.runtime.migration] Running upgrade 33ae817a1ff4 -> 27c6a30d7c24, kubernetes_resource_checkpointing

INFO  [alembic.runtime.migration] Running upgrade 27c6a30d7c24 -> 86770d1215c0, add kubernetes scheduler uniqueness

INFO  [alembic.runtime.migration] Running upgrade 86770d1215c0, 0e2a74e0fc9f -> 05f30312d566, merge heads

INFO  [alembic.runtime.migration] Running upgrade 05f30312d566 -> f23433877c24, fix mysql not null constraint

INFO  [alembic.runtime.migration] Running upgrade f23433877c24 -> 856955da8476, fix sqlite foreign key

INFO  [alembic.runtime.migration] Running upgrade 856955da8476 -> 9635ae0956e7, index-faskfail

INFO  [alembic.runtime.migration] Running upgrade 9635ae0956e7 -> dd25f486b8ea, add idx_log_dag

INFO  [alembic.runtime.migration] Running upgrade dd25f486b8ea -> bf00311e1990, add index to taskinstance

INFO  [alembic.runtime.migration] Running upgrade 9635ae0956e7 -> 0a2a5b66e19d, add task_reschedule table

INFO  [alembic.runtime.migration] Running upgrade 0a2a5b66e19d, bf00311e1990 -> 03bc53e68815, merge_heads_2

INFO  [alembic.runtime.migration] Running upgrade 03bc53e68815 -> 41f5f12752f8, add superuser field

INFO  [alembic.runtime.migration] Running upgrade 41f5f12752f8 -> c8ffec048a3b, add fields to dag

INFO  [alembic.runtime.migration] Running upgrade c8ffec048a3b -> dd4ecb8fbee3, Add schedule interval to dag

INFO  [alembic.runtime.migration] Running upgrade dd4ecb8fbee3 -> 939bb1e647c8, task reschedule fk on cascade delete

INFO  [alembic.runtime.migration] Running upgrade 939bb1e647c8 -> 6e96a59344a4, Make TaskInstance.pool not nullable

INFO  [alembic.runtime.migration] Running upgrade 6e96a59344a4 -> d38e04c12aa2, add serialized_dag table

Revision ID: d38e04c12aa2

Revises: 6e96a59344a4

Create Date: 2019-08-01 14:39:35.616417

INFO  [alembic.runtime.migration] Running upgrade d38e04c12aa2 -> b3b105409875, add root_dag_id to DAG

Revision ID: b3b105409875

Revises: d38e04c12aa2

Create Date: 2019-09-28 23:20:01.744775

INFO  [alembic.runtime.migration] Running upgrade 6e96a59344a4 -> 74effc47d867, change datetime to datetime2(6) on MSSQL tables

INFO  [alembic.runtime.migration] Running upgrade 939bb1e647c8 -> 004c1210f153, increase queue name size limit

INFO  [alembic.runtime.migration] Running upgrade c8ffec048a3b -> a56c9515abdc, Remove dag_stat table

INFO  [alembic.runtime.migration] Running upgrade a56c9515abdc, 004c1210f153, 74effc47d867, b3b105409875 -> 08364691d074, Merge the four heads back together

INFO  [alembic.runtime.migration] Running upgrade 08364691d074 -> fe461863935f, increase_length_for_connection_password

INFO  [alembic.runtime.migration] Running upgrade fe461863935f -> 7939bcff74ba, Add DagTags table

INFO  [alembic.runtime.migration] Running upgrade 7939bcff74ba -> a4c2fd67d16b, add pool_slots field to task_instance

INFO  [alembic.runtime.migration] Running upgrade a4c2fd67d16b -> 852ae6c715af, Add RenderedTaskInstanceFields table

INFO  [alembic.runtime.migration] Running upgrade 852ae6c715af -> 952da73b5eff, add dag_code table

Done.
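The first line of the output above shows that the SQLite file lands inside AIRFLOW_HOME. A quick way to check that it was created is sketched below. DB_FILE is a hypothetical variable name, and the fallback to ~/airflow (Airflow's default home) applies only when AIRFLOW_HOME is not set.

```shell
# DB_FILE is a hypothetical name; the path mirrors the "DB: sqlite:///..." line printed by initdb.
DB_FILE="${AIRFLOW_HOME:-$HOME/airflow}/airflow.db"

if [ -f "$DB_FILE" ]; then
  # Show the size and modification time of the freshly created database.
  ls -lh "$DB_FILE"
else
  echo "No database at $DB_FILE; check AIRFLOW_HOME and rerun airflow initdb"
fi
```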

 

********************* Step 8 ****************************

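Start the webserver by running airflow webserver. It listens on port 8080 (see the Host line in the output below), so the UI is available at http://localhost:8080.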

 

(airflow-tutorial1) saurabhkumar@Saurabhs-MacBook-Pro airflow-tutorial % airflow webserver

____________       _____________

____    |__( )_________  __/__  /________      __

____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /

___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /

_/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/

[2021-04-18 21:16:15,198] {__init__.py:51} INFO - Using executor SequentialExecutor

[2021-04-18 21:16:15,199] {dagbag.py:396} INFO - Filling up the DagBag from /Users/saurabhkumar/Documents/airflow-tutorial/dags

Running the Gunicorn Server with:

Workers: 4 sync

Host: 0.0.0.0:8080

Timeout: 120

Logfiles: - -

=================================================================

[2021-04-18 21:16:17 +0530] [58546] [INFO] Starting gunicorn 19.10.0

[2021-04-18 21:16:17 +0530] [58546] [INFO] Listening at: http://0.0.0.0:8080 (58546)

[2021-04-18 21:16:17 +0530] [58546] [INFO] Using worker: sync

[2021-04-18 21:16:17 +0530] [58552] [INFO] Booting worker with pid: 58552

[2021-04-18 21:16:17 +0530] [58553] [INFO] Booting worker with pid: 58553

[2021-04-18 21:16:17 +0530] [58554] [INFO] Booting worker with pid: 58554

[2021-04-18 21:16:17 +0530] [58555] [INFO] Booting worker with pid: 58555

[2021-04-18 21:16:17,371] {__init__.py:51} INFO – Using executor SequentialExecu
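Once Gunicorn reports that it is listening, you can check from another terminal that the webserver answers. This is a rough sketch: is_up is a hypothetical helper, and it treats any HTTP response at all as "up" without inspecting the page.

```shell
# is_up URL: succeed if anything answers HTTP at URL within 5 seconds.
# (is_up is a hypothetical helper name.)
is_up() {
  # -s hides progress, -o /dev/null discards the body, -w prints only the status code.
  # curl prints 000 when it cannot connect.
  code="$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$1" 2>/dev/null)"
  [ -n "$code" ] && [ "$code" != "000" ]
}

# Port 8080 matches the "Listening at" line in the output above.
if is_up "http://localhost:8080"; then
  echo "webserver is answering"
else
  echo "no response yet"
fi
```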

 

 

 

********************** Step 9 ***********************************

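Open another terminal tab and start the Airflow scheduler. The export from Step 4 and the activation from Step 3 apply only to the shell where they were run, so repeat both in the new tab first.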

 

(base) saurabhkumar@Saurabhs-MacBook-Pro airflow-tutorial % export AIRFLOW_HOME=/Users/saurabhkumar/Documents/airflow-tutorial

(base) saurabhkumar@Saurabhs-MacBook-Pro airflow-tutorial % pwd

/Users/saurabhkumar/Documents/airflow-tutorial

(base) saurabhkumar@Saurabhs-MacBook-Pro airflow-tutorial % conda activate airflow-tutorial1

(airflow-tutorial1) saurabhkumar@Saurabhs-MacBook-Pro airflow-tutorial % airflow scheduler
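Since every new tab needs the same two setup commands, one option is to keep them in a small file and source it. The sketch below writes such a file. The name airflow-env.sh and the ENV_FILE variable are hypothetical, and it assumes conda is already initialized in your shell, as the (base) prompt above suggests.

```shell
# ENV_FILE and the file name airflow-env.sh are hypothetical choices.
ENV_FILE="${ENV_FILE:-./airflow-env.sh}"

# The quoted 'EOF' keeps the shell from expanding anything inside the file body.
cat > "$ENV_FILE" <<'EOF'
# Source this file in every new terminal tab: . ./airflow-env.sh
export AIRFLOW_HOME=/Users/saurabhkumar/Documents/airflow-tutorial
conda activate airflow-tutorial1
EOF

echo "wrote $ENV_FILE"
```

In a new tab, run . ./airflow-env.sh from the project directory and then airflow scheduler. Sourcing the file, rather than executing it, is what lets the export and the activation affect the current shell.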

 

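*********************** Step 10 *********************

Create a user and password in Airflow

With Airflow 1.10.10 the subcommand is create_user; airflow users create is the Airflow 2.x form of the same command. These accounts are used by the RBAC UI, which 1.10.x enables with rbac = True under [webserver] in airflow.cfg.

airflow create_user -r Admin -u saukumar -e saurabhmcakiet@gmail.com -f Saurabh -l Singh -p saukumar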

