Yeah, like the other reply, I'd mostly say testing DAGs was the issue. Airflow-related configuration is easy to get wrong, and it often fails silently.
Now that I think about it though, most of the time I spent on testing wasn't caused by Airflow. Testing data pipelines just isn't easy with the current well-known tooling.
Probably DAGs. Operators can be tested through their hooks, but in my experience testing a DAG is annoying. I usually just make a copy that does a dry run or runs with test data, or I test in a local Airflow container, since that's much faster.
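For the "runs with test data" part, here's a rough sketch of what I mean. It assumes the task logic is pulled out into a plain Python function so it can be checked without a scheduler or a container. The function name `clean_orders` and the row fields are made up for illustration, not from any real pipeline.

```python
from __future__ import annotations

from typing import Iterable


def clean_orders(rows: Iterable[dict]) -> list[dict]:
    """Hypothetical transform: drop rows with no order_id, cast amount to float."""
    cleaned = []
    for row in rows:
        if not row.get("order_id"):
            # Skip rows that are missing an id instead of failing the whole run.
            continue
        cleaned.append({"order_id": row["order_id"], "amount": float(row["amount"])})
    return cleaned


def test_clean_orders():
    # Small in-memory test data standing in for the real source table.
    sample = [
        {"order_id": "a1", "amount": "10.50"},
        {"order_id": None, "amount": "3"},
        {"order_id": "b2", "amount": 7},
    ]
    assert clean_orders(sample) == [
        {"order_id": "a1", "amount": 10.5},
        {"order_id": "b2", "amount": 7.0},
    ]
```

The task in the DAG would then just call the function, so the DAG-level test only has to cover wiring and configuration, not the data logic.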