by Tombar 1274 days ago
GitHub Actions FTW! > https://docs.github.com/en/actions/using-workflows/events-th...

I remember seeing a couple projects shared before, using this technique to scrape sites with GHA

5 comments

Just note that they’re not guaranteed to be called precisely on time, e.g. my “every 15m” CRON job will be called every 15m _at best_, in practice… twice per hour.

This works perfectly for my case (content syndication for https://potato.horse), and I’m pretty happy with GH actions for this kind of stuff, but if you need something more precise, you might want to look somewhere else.

> Just note that they’re not guaranteed to be called precisely on time, e.g. my “every 15m” CRON job will be called every 15m _at best_, in practice… twice per hour

Is the spread really that egregious? That's essentially a 50% failure to trigger at all, like I don't think you can call that 15 minutes so much as 15±10 minutes lmfao

IIRC AWS EventBridge is also not guaranteed to execute on the exact minute, but in my experience running a small job every 5 minutes only had about a 30-40 second delay at worst.

Just checked the logs: it used to be 2 per hour, but now it has improved to 3-5, so closer to the "every 15 m" rule. Again, not a big deal in my case and probably something that's being worked on.
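
For anyone who wants to put a number on this kind of drift, here is a minimal sketch that tallies how many runs actually started in each clock hour. The timestamps in `SAMPLE_RUNS` are made up for illustration and the function name is hypothetical; in practice the start times would come from your own workflow run history.

```python
from collections import Counter
from datetime import datetime, timezone


def runs_per_hour(timestamps):
    """Count how many runs started in each UTC clock hour."""
    counts = Counter()
    for ts in timestamps:
        # Accept the trailing "Z" form by turning it into an explicit offset.
        started = datetime.fromisoformat(ts.replace("Z", "+00:00"))
        hour = started.astimezone(timezone.utc).strftime("%Y-%m-%d %H:00")
        counts[hour] += 1
    return dict(counts)


# Hypothetical start times for a workflow scheduled every 15 minutes.
SAMPLE_RUNS = [
    "2023-01-01T10:07:12Z",
    "2023-01-01T10:41:55Z",
    "2023-01-01T11:03:30Z",
    "2023-01-01T11:24:09Z",
    "2023-01-01T11:52:47Z",
]
```

An "every 15m" schedule would ideally give 4 runs per hour, so comparing each hourly count against 4 shows how far the real cadence falls short.
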
At least for a while before I moved to self-hosted systemd timers, if you were running jobs near 0:00 UTC, the delay was often so long that your job wouldn't run at all. I had weeks of jobs not running before bailing on GitHub Actions for this sort of thing. It was disappointing, but I'm happier with my current setup.
https://simonwillison.net/2020/Oct/9/git-scraping/

Simon Willison has a bunch of examples of scraping sites with GHA and storing the results in a repo. But you can use the same technique without the storing part if need be.

AFAIK, scheduled GitHub workflows stop running after a while. But when that happens, GitHub will send you an email with a big green “Continue running workflow” button.
Actions won’t run on repos with no activity for 60 days [0]. So make sure your action commits something to the repo to keep up activity.

This seems to meet OP’s use case as they have to commit state anyway.

[0] https://docs.github.com/en/actions/managing-workflow-runs/di...
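
A rough sketch of the arithmetic behind that cutoff, assuming the 60-day figure quoted above and that you already know when the repo last saw activity. The function name and the way "activity" is measured are assumptions for illustration, not anything GitHub exposes under that name.

```python
from datetime import datetime, timedelta, timezone

# Inactivity window as quoted in the thread; treat it as an assumption.
INACTIVITY_LIMIT = timedelta(days=60)


def days_until_disabled(last_activity, now):
    """Return whole days left before the inactivity window runs out (0 if past)."""
    remaining = (last_activity + INACTIVITY_LIMIT) - now
    return max(remaining.days, 0)
```

For example, with the last commit on 1 January and today being 1 February, `days_until_disabled` reports 29 days of headroom.
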

I use crons to keep my Docker containers fresh, and have never hit this. But the cron commits to the repo, so I wonder if they’re flagging repos with crons but no commits recently?
Definitely... It seems to be in a 4-6 week timeframe... I'd thought about making the cron update references in the repo, but haven't gotten around to that yet.
I've got a couple of projects like this that mostly just create bundles of other source projects that I'm not involved with: creating a Windows installer or Docker image for projects that don't have that integrated. It's kind of annoying that they will stop in several weeks when there are no project changes.
> It's kind of annoying that they will stop in several weeks when there's no project changes.

Can't you configure a GHA which commits nonsense every month?

Yeah... I will probably do that with my daily check action (update a last-check text file or something).
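
The "last-check text file" idea could look roughly like this. It is only a sketch: the file name `last-check.txt` and the function name are made up, and the workflow would still need its own step to commit and push the changed file.

```python
from datetime import datetime, timezone
from pathlib import Path


def write_last_check(path="last-check.txt", now=None):
    """Overwrite the marker file with the current UTC time and return the stamp."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y-%m-%dT%H:%M:%SZ")
    # The file content changes on every run, so there is always something to commit.
    Path(path).write_text(stamp + "\n", encoding="utf-8")
    return stamp
```

Calling `write_last_check()` from the daily check job and committing the result should count as repo activity, if the inactivity rule works the way the thread describes.
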
Can you not trigger an action via a local cronjob and the API?
But then I need a second system to do that.
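
For reference, the local-cronjob-plus-API route suggested above could be sketched like this, using GitHub's REST endpoint for creating a workflow dispatch event. The owner, repo, workflow file name, and token are placeholders, and the target workflow has to declare a `workflow_dispatch` trigger for the call to be accepted.

```python
import json
import urllib.request


def build_dispatch_request(owner, repo, workflow_file, token, ref="main"):
    """Build (but do not send) a POST that asks GitHub to run a workflow."""
    url = (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/actions/workflows/{workflow_file}/dispatches"
    )
    body = json.dumps({"ref": ref}).encode("utf-8")
    headers = {
        "Accept": "application/vnd.github+json",
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def send(request):
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request) as response:
        return response.status
```

A crontab entry on the second machine would then run a small script that calls `send(build_dispatch_request(...))` with a token that is allowed to trigger workflows on the repo. As noted, this does mean keeping another system alive.
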
Sounds like time to make a “hit the big green button” GitHub Action.
I wonder if I can make a GitHub repo with an action that commits to another repo on a cron. Or itself?
This works great! I do this in one of my projects if you need a reference.

https://github.com/patrickgalbraith/rageagain/blob/master/.g...

I came to suggest the same thing.

I use their cronjob functionality to ensure my Docker images are rebuilt daily and are therefore, in theory, secure.