by askldjd
3274 days ago
You are dead on. We do have a bug where we are not recovering Oracle connectivity correctly. It is on our radar to address the issue. https://github.com/department-of-veterans-affairs/caseflow-m...

However, there is actually another 50% of the story that I never posted. VACOLS is a really old Oracle DB (from the 80s) that is out of our control. Somehow, it has a "feature" where you can only make one TCP connection to it every 2-3 seconds. So if we lose our connections to the database, it will take many seconds to recover. By that point, our ELB health check would've fired and restarted our EC2 instances. This is why recoverability of the database connection is not an immediate priority.

Here's how we preallocate the VACOLS connection pool to work around this throttling feature. https://github.com/department-of-veterans-affairs/caseflow/b...

The infrastructure we operate in is very challenging (and interesting) because of legacy systems. That's why common-sense engineering often may not apply at USDS.
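For anyone who wants the general idea without following the link, here is a rough Python sketch of preallocating a pool against a server that throttles new connections. It is not the Caseflow code: `PrewarmedPool`, `connect`, and `min_interval` are hypothetical names made up for illustration, and the spacing between connection attempts is an assumption you would tune to the server.

```python
import queue
import time


class PrewarmedPool:
    """Hypothetical pool that opens every connection up front.

    All slow, throttled connects happen once at startup, so no request
    has to wait for a new connection later.
    """

    def __init__(self, connect, size, min_interval=0.0,
                 sleep=time.sleep, clock=time.monotonic):
        # connect: zero-argument callable returning a new connection (assumed).
        # min_interval: assumed minimum spacing in seconds between attempts.
        self._idle = queue.Queue(maxsize=size)
        last_attempt = None
        for _ in range(size):
            if last_attempt is not None:
                # Wait out whatever remains of the throttle window.
                remaining = min_interval - (clock() - last_attempt)
                if remaining > 0:
                    sleep(remaining)
            last_attempt = clock()
            self._idle.put(connect())

    def acquire(self, timeout=None):
        # Blocks until a connection is free; raises queue.Empty on timeout.
        return self._idle.get(timeout=timeout)

    def release(self, conn):
        # Return the connection for reuse instead of closing it.
        self._idle.put(conn)
```

With `size=10` and `min_interval=3.0`, startup takes roughly 27 seconds plus connect time. That startup cost is the tradeoff: you pay it once at boot rather than on a live request.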
Kudos on having something interesting to work on.