by charlesdaniels 2171 days ago
This sounds really cool!

However, I wonder about the reliability aspects of it? Many sites are pretty hostile to Selenium and robot-like behaviors. Also, software can change its layout, colors, icons, etc.

I've seen this with a team that was using SikuliX[0] to automate testing of a GUI application. It worked, but any time the UI changed even a little, the whole automation program would have to be re-done.

In short, I wonder how ElectroNeek solves these issues:

1. Working around software that has been designed to be hostile to automation.

2. Coping with changes to a UI when the software is updated.

3. All the difficult little error handling bits. How do you know if the workflow succeeded? How do you make sure it only happens once? How do you notify the user?

4. Where does the program run? Do you have to have a pile of PCs sitting around running mouse movement scripts in real-time?

Another thing I wonder -- how do the users feel about this? I think many people would be uncomfortable knowing that there is a Sword of Damocles hanging over their head, waiting to automate their job away if it seems too repetitive. I guess that would ultimately be a cultural issue for the company using this tool to solve, but still something I think would be important in order for clients to be successful using it long-term. For example, you don't want to create an incentive for people to try and make their tasks seem non-repetitive in order to avoid having their job automated.

0 - http://www.sikulix.com/

3 comments

Thank you Charles,

1. Interface elements in Windows apps and websites typically have identifiers that let us hook into them correctly even if the visual representation/layout changes, so we were able to automate pretty much any software/website we have tried. Some websites run active A/B tests all the time, and that becomes a challenge, though a doable one.

2. This is a fundamental challenge of GUI-level automation; there is no silver bullet that completely eliminates the need for bot 'maintenance'. We allow users to create a library of the UI elements they use in their workflows. Again, the identifiers often let the bot handle UI changes automatically, but if not, users can just relink their interface library to the changed GUI elements without disrupting the bot logic, so keeping the bot working in a changing GUI environment becomes a minor effort.

3. (1) Real-time tracking of what the bot does, (2) Logs, (3) 'Exception' port to build custom logic for user involvement/notification at every step of the automated process. For instance, the bot can send you a Slack message or an email if it runs into an issue.

4. It can run on end-user computers, where it looks like a cross-application macro, or on a virtual machine/server 24/7 ('unattended' process automation). We partner with Microsoft to bundle our software with Azure infrastructure and make it scalable - so you can deploy more specific bots when the workload for them goes up.

5. You are right, this is a cultural aspect and a people challenge. In my own experience, the best way to address it is to let people who got a few hours back thanks to routine elimination share their experience and how their career path has changed since then; for instance, they picked up some analytical tasks that would otherwise have required a new hire.
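
For readers who want something concrete, points 2 and 3 above might look roughly like the sketch below. It is only an illustration under assumptions, not ElectroNeek's actual implementation: `ElementLibrary`, `run_steps`, the locator strings, and the `notify` callback (standing in for a Slack message or email) are all hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ElementLibrary:
    """Hypothetical UI element library: logical name -> current locator."""
    locators: dict[str, str] = field(default_factory=dict)

    def link(self, name: str, locator: str) -> None:
        # Register a new element, or relink an existing one after a UI change.
        self.locators[name] = locator

    def resolve(self, name: str) -> str:
        return self.locators[name]


def run_steps(
    steps: list[tuple[str, str]],
    lib: ElementLibrary,
    perform: Callable[[str, str], None],
    on_exception: Callable[[str, Exception], None],
) -> bool:
    """Run (action, element_name) steps; route the first failure to on_exception."""
    for action, element in steps:
        try:
            perform(action, lib.resolve(element))
        except Exception as exc:
            # The 'exception port': hand the failure to user-defined logic and stop.
            on_exception(f"{action} {element}", exc)
            return False
    return True


# Bot logic refers only to logical names, never to raw locators.
WORKFLOW = [("type", "customer_name"), ("click", "submit_button")]

lib = ElementLibrary()
lib.link("customer_name", "#name")
lib.link("submit_button", "#btn-submit")

# Pretend the app was updated: the old submit locator no longer exists.
present_on_screen = {"#name", "[data-test=submit]"}
performed, alerts = [], []


def perform(action: str, locator: str) -> None:
    # Stand-in for real GUI automation.
    if locator not in present_on_screen:
        raise LookupError(f"element {locator} not found")
    performed.append((action, locator))


def notify(step: str, exc: Exception) -> None:
    # Stand-in for sending a Slack message or email.
    alerts.append(f"bot stopped at '{step}': {exc}")


first_run = run_steps(WORKFLOW, lib, perform, notify)   # fails at the click
lib.link("submit_button", "[data-test=submit]")         # relink; WORKFLOW unchanged
second_run = run_steps(WORKFLOW, lib, perform, notify)  # now succeeds
```

Note that the "type" step runs twice here, once in each attempt, which is exactly the "make sure it only happens once" problem from the original question; this sketch does nothing to solve it.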

I use this in the wild and like anything it has its pros and cons.

The best usage is automating repetitive tasks in a call center environment. Like you click a "Start my day" button and it opens up and logs you into every application that you use. Then a customer calls in and you put their information in one place and that opens the customer's record in all of those applications. Time is money in call centers and you can save a lot of time with simple stuff like this.

More questionable usage is as a pseudo-API background process. The users fill out a form and that information is sent over to a queue. The robot pulls it out of the queue and then enters it into the slow/confusing legacy backend system. This is good because it's cheaper/faster than building out an API to a legacy backend that is difficult and expensive to change.

The problem is that now you have a brittle and asynchronous communication layer over a legacy backend that is difficult and expensive to change. You compound the problem you have in exchange for quick benefits.
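
To make the pattern concrete, here is a minimal sketch of the queue-backed robot described above. Everything in it is an assumption for illustration: `run_worker`, the job dictionaries, and `fake_legacy_entry` (which stands in for the GUI automation that types into the legacy system) are hypothetical, and the in-memory `done_ids` set would need to be durable storage in anything real.

```python
import queue


def run_worker(jobs: queue.Queue, enter_into_legacy, done_ids: set, failed: list) -> None:
    """Drain the queue, skipping jobs whose id was already entered successfully."""
    while True:
        try:
            job = jobs.get_nowait()
        except queue.Empty:
            return
        if job["id"] in done_ids:
            continue  # duplicate submission: already entered, skip it
        try:
            enter_into_legacy(job)
            done_ids.add(job["id"])
        except Exception as exc:
            # The GUI step broke; park the job for a human instead of losing it.
            failed.append((job["id"], str(exc)))


entered = []


def fake_legacy_entry(job: dict) -> None:
    # Stand-in for the robot typing the form into the legacy backend.
    if job["name"] == "":
        raise ValueError("name field rejected")
    entered.append(job["id"])


jobs = queue.Queue()
for job in [{"id": 1, "name": "Ann"}, {"id": 1, "name": "Ann"}, {"id": 2, "name": ""}]:
    jobs.put(job)

done, failed = set(), []
run_worker(jobs, fake_legacy_entry, done, failed)
```

The brittleness shows up in the `except` branch: the user who filled out the form got no error at submit time, so the failure has to be surfaced some other way. And if the robot dies halfway through an entry, the id is never recorded and a retry may enter the data twice.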

Yes, a call center is a great example, as is any customer order/processing center that uses many non-integrated IT solutions, or where APIs don't provide the flexibility/customization that the required workflows need to run.

I worked on (2) a little bit. Serializing the DOM and then putting it into probabilistic data structures is the way to go. Bloom filters, LSH, and min-sketch are good ways to fuzzy match. UIs change, but for the most part the code doesn't.
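
A rough sketch of what that kind of fuzzy matching could look like, using MinHash (one common locality-sensitive hashing scheme for set similarity). This is not necessarily what the commenter built: the element dictionaries, the feature scheme in `features`, and the helper names are all made up for illustration.

```python
import hashlib


def features(element: dict) -> set[str]:
    """Serialize one DOM element into a set of string features."""
    feats = {f"tag={element['tag']}"}
    feats |= {f"{k}={v}" for k, v in element.get("attrs", {}).items()}
    feats |= {f"text={w.lower()}" for w in element.get("text", "").split()}
    return feats


def minhash(feats: set[str], num_hashes: int = 64) -> list[int]:
    """MinHash signature: for each salted hash, keep the minimum over all features."""
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(8, "big")
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(f.encode(), digest_size=8, salt=salt).digest(), "big"
            )
            for f in feats
        ))
    return signature


def similarity(a: list[int], b: list[int]) -> float:
    """Fraction of matching slots, which estimates the Jaccard similarity."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def best_match(target: dict, candidates: list[dict]) -> dict:
    """Return the candidate whose signature is closest to the target's."""
    target_sig = minhash(features(target))
    return max(candidates, key=lambda c: similarity(target_sig, minhash(features(c))))


# The element the bot was recorded against.
old_button = {
    "tag": "button",
    "attrs": {"id": "submit", "class": "btn primary", "type": "submit"},
    "text": "Place order",
}

# The page after a UI update: the button's id changed.
new_page = [
    {"tag": "a", "attrs": {"href": "/help", "class": "link"}, "text": "Help"},
    {
        "tag": "button",
        "attrs": {"id": "submit-v2", "class": "btn primary", "type": "submit"},
        "text": "Place order",
    },
    {"tag": "input", "attrs": {"type": "text", "name": "q"}, "text": ""},
]

match = best_match(old_button, new_page)
```

Here the renamed button still shares most of its features with the recorded one, so it should come out as the closest candidate even though an exact id lookup would fail. In practice you would also want a minimum similarity threshold so the bot refuses to click when nothing is a good match.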