Ask HN: How do you search for $string if a webpage “supports” infinite scroll?
101 points by zeta0x10 2307 days ago
Let's say you visit a webpage like Reddit and want to search for $string. Maybe I am too stupid, but in my opinion it's just not possible anymore. :( Content is dynamically loaded in and out and you have to scroll carefully not to miss a $string when using browser based CTRL-F. That is just ridiculous.

For me infinite-scroll is one of the most stupid features of the "modern" web. It just makes the experience worse to crank up some dubious engagement numbers.

And to make it clear: I don't want to search for stuff via $string site:reddit.com via some search engine. Often I really want to search for exact $string on a page, but on something like Reddit that does not work anymore.

[/rant]

EDIT: Thanks for all the answers. And I think I hit a nerve here :) Maybe it makes some frontend developers take a step back and really think if it's really a good idea to implement that "feature".

As suggested you can use old.reddit.com in case of Reddit, but for some pages, there just isn't an option and the worst offenders even hijack your CTRL-F and want you to use their own terrible search.

32 comments

I see lots of replies suggesting scrolling for a long time and then using CTRL+F.

However, this won't work if the page is using virtualised scrolling (common with React et al. SPA for performance reasons, to avoid huge DOM trees as the page expands). The majority of content that is outside of the visible window will simply be unmounted from the DOM.
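To make the unmounting concrete, here is a minimal sketch of the windowing arithmetic a virtualised list might use. The function name, the fixed row height and the `overscan` margin are all assumptions for illustration, not any particular library's API.

```javascript
// Hypothetical helper: which rows does a virtualised list keep mounted?
// Assumes every row has the same height, which real lists often do not.
function visibleRange(scrollTop, viewportHeight, rowHeight, totalRows, overscan = 2) {
  const first = Math.max(0, Math.floor(scrollTop / rowHeight) - overscan);
  const last = Math.min(
    totalRows - 1,
    Math.ceil((scrollTop + viewportHeight) / rowHeight) - 1 + overscan
  );
  return {
    first,
    last,
    // Empty spacers stand in for the unmounted rows so the scrollbar stays honest.
    padTop: first * rowHeight,
    padBottom: (totalRows - 1 - last) * rowHeight,
  };
}

const r = visibleRange(5000, 800, 50, 10000);
console.log(r.last - r.first + 1); // 20 of the 10000 rows are in the DOM
```

Everything outside `first..last` is just blank padding, which is why the browser's find-in-page has nothing to match there.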

I'm not sure what the best-practice for a webapp designer is here? Perhaps intercepting Ctrl+F and displaying a custom search that will do the correct filtering on the back-end and update/retarget the view? Azure DevOps does this but it's still frustrating if your focused element is not within the capture point for the event.

> I'm not sure what the best-practice for a webapp designer is here? Perhaps intercepting Ctrl+F

Oh god no, the solution is simple: stop reimplementing the browser in the browser.

Google Drive does this within Google Chrome, fwiw. I don't disagree with the spirit of your comment though.
I'd say Google and Algolia are the only companies with the resources and know-how in search that can even think of this. And probably even they shouldn't.
The result can give a much better UX if done properly. (If done poorly it's just frustrating of course...)

Just think of SPAs vs normal backend templated apps. The SPA can be much faster with poor network connection or when the data needed is large but unpredictable. Then the SPA can really shine, if routing is (re-)implemented well enough.

What about MX (machine experience)?
I would agree, if the opposite didn't make the page unusable. Infinite scrolling doesn't really work if you don't unload what's on the screen (at least it didn't the last time I tried it, in 2014).
That's a horrible UX.

A better practice would be to avoid infinite scroll and better yet to think carefully before using a full-blown SPA for a standard UI. If it's a web page as opposed to a web app, there are certain UX features that every browser supports out of the box and many, many users with keyboards will expect.

> Perhaps intercepting Ctrl+F

Please don't ever do that. I know one site that does this and I hate it.

Google Docs does this, e.g. in a spreadsheet. I think it's reasonable in that context.

For what it's worth, if you really need to search e.g. "View" to figure out where the View menu is, the OS-native menu option still works.

For applications that's fine. I'm mostly referring to traditional websites/forums that should never do this.
Discourse [1], popular forum software, does this as well.

[1] https://www.discourse.org/

Not a fan either, but at least it allows you to repeat the keystroke to do a native page search.
This is infuriating. Every time I CTRL+F on Discourse I get upset.
At one point I had a user script that prevented certain key combinations from being intercepted. Worse than discourse is a hosting platform I was encountering frequently for a while, where pressing esc to stop the page from loading instead caused the website to switch pages to a login prompt for the site owner. Other offenders that annoy me include intercepting alt+d (focus address bar), Ctrl+n (new window, outlook captures this), Ctrl+t (new tab), Ctrl+a (select all), and other common keyboard functions.

Fix: Just add a dummy onkeyup function to the document root and body elements, and set @match rules for whatever sites annoy you. Occasionally you'll also need the more generic event handler function as well.
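A rough sketch of what such a user script could look like. This is not the commenter's script: instead of a dummy onkeyup it uses a capture-phase listener with stopImmediatePropagation(), which is a swapped-in technique. The @match URL and the list of combos are placeholders, and a site that registers its own capture listener on window earlier can still win.

```javascript
// ==UserScript==
// @name        Shield browser shortcuts (sketch)
// @match       https://forum.example.com/*
// @run-at      document-start
// @grant       none
// ==/UserScript==

// Combos to protect; illustrative list based on the comment above.
const SHIELDED = ['ctrl+f', 'ctrl+t', 'ctrl+n', 'ctrl+a', 'alt+d', 'escape'];

// Normalise a keyboard event to a string such as "ctrl+f" (Cmd counts as ctrl).
function comboOf(e) {
  const mods = [];
  if (e.ctrlKey || e.metaKey) mods.push('ctrl');
  if (e.altKey) mods.push('alt');
  if (e.shiftKey) mods.push('shift');
  return mods.concat(String(e.key).toLowerCase()).join('+');
}

function shield(e) {
  if (SHIELDED.includes(comboOf(e))) {
    // Hide the event from the page's later handlers, but do NOT call
    // preventDefault(), so the browser's own action still runs.
    e.stopImmediatePropagation();
  }
}

// Only wire up when running in a page.
if (typeof window !== 'undefined') {
  for (const type of ['keydown', 'keypress', 'keyup']) {
    window.addEventListener(type, shield, true); // true = capture phase
  }
}
```

Running at document-start matters here: the listener has to be registered before the site's own handlers to get first pick of the event.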

It shouldn't be possible. That the browsers allow it is the real problem. Same with forced unmounting of resources from the DOM.
1. Open devtools. Disable cache. Reload the page. Go to sources panel. CTRL-SHIFT-F. Enjoy the JSON response.

2. Alternative: same but with web debugging proxy like Fiddler or Charles.
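If the response is large, eyeballing it gets old. A small sketch for searching a JSON body once you have copied it out of the Network panel or the proxy; the function name and the sample body are made up for illustration.

```javascript
// Walk a parsed JSON value and collect the path of every string containing `needle`.
function findInJson(value, needle, path = '$', found = []) {
  if (typeof value === 'string') {
    if (value.toLowerCase().includes(needle.toLowerCase())) found.push({ path, value });
  } else if (Array.isArray(value)) {
    value.forEach((v, i) => findInJson(v, needle, `${path}[${i}]`, found));
  } else if (value !== null && typeof value === 'object') {
    for (const [k, v] of Object.entries(value)) findInJson(v, needle, `${path}.${k}`, found);
  }
  return found;
}

// Made-up response body, standing in for text copied from devtools.
const body = '{"items":[{"title":"Hello"},{"title":"Infinite scroll rant"}]}';
console.log(findInJson(JSON.parse(body), 'scroll'));
// [ { path: '$.items[1].title', value: 'Infinite scroll rant' } ]
```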

Yep. For JS-heavy pages with virtual scrolling (i.e. removing from DOM stuff that is out of viewport) there's no better solution.

that's what discourse does. it's frustrating because it's significantly slower than the built-in browser search, probably in part due to excessive network accesses, but also probably due to poor caching and inefficient implementation.
I hate discourse's CTRL+F. When things like this happen I open the devtools and search in the dom, and it's a horrible experience too. I just don't want to learn each website's custom controls. Don't hijack my CTRL+F or my scrolling or anything else please!
The custom UI is also IMO terrible. It's completely different than the browser Ctrl-F UI. It does not simply highlight and move to the matches. Instead it shows a "preview", so you have to navigate to each match separately and back to the search UI. And it just navigates to the matching comment, but for very long comments I still have no idea where in the comment the match is. Thankfully you can press Ctrl-F twice to get the browser UI.
The solution is to stop re-writing basic browser functions. Just let me scroll.
Google Docs implements custom search. Unless web frameworks start to expose some basic controls to the user, like pre-walled-garden Operating Systems, that's what we have to wait for
GitHub does this in workflow logs and it makes it impossible to copy/paste logs greater than the number of lines in your view.

Then you download the raw logs and every line has a timestamp so you need to write a shell command just to parse out the actual text you wanted to copy.

They also bind to the "/" to focus on their searchbox, which breaks the native behavior in FF (which would be a single-keystroke for Find in Page, at least meta-f still works). Google likes to do this too, e.g. Google Tag Manager.
> Perhaps intercepting Ctrl+F and displaying a custom search that will do the correct filtering on the back-end and update/retarget the view?

Stripe does this in their docs and I find it absolutely infuriating.

You can also use Ctrl+G
For reddit specifically, the answer seems to be to use "old.reddit.com" instead of "www.reddit.com".

There is no general solution. Infinite scroll style web apps are implementing their own content view, in essence a web browser inside a web browser. It will never behave as users expect.

This is a good answer that works in reddit, but if you change the user agent, it will work in almost any website with support for ie.

IE6 User Agent:

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)

https://addons.mozilla.org/es/firefox/addon/uaswitcher/

This can be done with Selenium too.

https://stackoverflow.com/questions/29916054/change-user-age...

You can set a custom UA in Firefox without an extension by going to about:config and adding the new UA as a string called `general.useragent.override`.
Yes you can, but with the addon it is easier, and you can set your custom user agents and store within, it is just for comfort.
This add-on for Firefox is helpful for that (redirects all www to old): https://addons.mozilla.org/en-US/firefox/addon/old-reddit-re...

I presume this is the corresponding Chrome extension: https://chrome.google.com/webstore/detail/old-reddit-redirec...

I personally prefer the interface of "i.reddit.com", but they semi-frequently let the cert expire for that one.
Other problems:

- Scroll, scroll, scroll, and then click a link. Now click back. You have lost your place.

- How can anyone link to a section far down?

- Impossible to read or use anything in the footer. I suppose if you install infinite scrolling, you remove the footer. But I have heard a story where they forgot.

Twitter seems to have the back thing down somehow. I don't know how it works (my best guess is making a <div style=height:scrollTop-200px> and only loading a few tweets around the scroll position and making that work with bfcache) but it is apparently possible to make that work fine.

Linking to things in the middle of a page was solved in the 90s with the hash part of the URL. Aside from legal terms or privacy policies, I haven't yet seen a site with excessively tall pages that don't have some mechanism (hash part or direct linking an entry) of linking to in-the-middle content.
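For reference, the hash mechanism is tiny. A sketch using the standard URL class; the forum URL and the "post-750" id are invented for the example.

```javascript
// Build a link that points at one entry inside a long page.
function permalinkFor(pageUrl, entryId) {
  const url = new URL(pageUrl);
  url.hash = entryId; // the browser scrolls to the element with this id
  return url.toString();
}

// Recover the entry id from such a link ('' when there is no fragment).
function entryIdFrom(link) {
  return decodeURIComponent(new URL(link).hash.slice(1));
}

console.log(permalinkFor('https://forum.example.com/thread/42?page=3', 'post-750'));
// https://forum.example.com/thread/42?page=3#post-750
```

On a static page the browser does the rest. On an infinite-scroll page the app has to read the id itself and load content until that element exists.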

Most sites with infinite scroll don't have a footer, though I have seen a few. Inspect element works well enough there imo, if you're tech savvy of course.

> Linking to things in the middle of a page was solved in the 90s with the hash part of the URL.

And then promptly broken again in the mid-00s more or less till now, by way of sub-par client side routing, giving rise to SPAs and breaking back buttons, history management and deep linking all over the place. Things are getting slightly better since the history api came about, but you still see loads of sites with non-existent or broken deep linking, and crappy history management, such as reddit. This is of course because the browser behavior of fragment links, history etc. on static content is well defined, but for dynamic content it in most cases has to be considered and implemented in the application logic, and it’s never a priority.

Hands up everyone who’s been in planning sessions for your project and right from the start you’ve considered deep linking and history management! Anyone..? No, ok.

Truth is, in my near 15 years of web dev experience, I’ve never once seen this be part of the requirements, because you kind of just expect it to work. Yet it’s incredibly easy to break, unintentionally even, and three weeks before release it’s usually incredibly difficult to fix because of all the corners we’ve painted ourselves into. So then it becomes a thing you fix after the fact, and spend tons of time trying to figure out how to wrangle your router or framework or what have you to work it out, but turns out you’ll probably have to rewrite it all. But of course you won’t, because it’s silly, so it just becomes one of those things you paper over where possible and quietly ignore otherwise... sigh

Sorry for the rant.

This. So much this. Infinite scrolling is one of the worst UI designs ever. Plus most I've seen are poorly implemented and buggy, which just makes it that much worse.
First two are somewhat easily solvable, though it involves JavaScript.

Sadly, most devs know how to implement an infinite scroll, but lack the skill to implement it properly.

Footer — there should be none on an infinitely-scrolled page, but that depends on the layout.

You'd have the same problem with pagination, you'd need to Ctrl+f on each page.

Solution? Use the website's search form.

They're similar, but at least with traditional pagination you have a very clear page transition boundary. You can manually loop through Next → Ctrl-F repeatedly much faster in my experience. With infinite scroll I'm left to work out my own "pages" by eye or some other mechanism, which is slower. Not by a lot, but still slow enough that I feel the friction.
Pagination is still preferable to infinite scrolling in my opinion, especially if it is reflected in the URL. Never actually found a positive example of infinite scrolling. It might be good for activities that are unfocused, but for anything else it is misplaced. For browsing images I could imagine use cases. But not for an archive on the other hand.
I agree with this. I feel like infinite scroll gets a bad rap. Pagination is no better. There are some sites and use cases where infinite scroll is a huge usability improvement.

(I know -- pagination gives you a URL to a specific page of results. Who cares? When was the last time you deep-linked into a specific index page? Doing so is probably a bad idea, since when new content gets added, the contents of a specific index pagination will change.)

Pagination is a lot better. In the example presented, most of the time the user just wants to find something on the front page. Nobody uses ctrl-F to search all of reddit.
The "front" page would be the default load of an infinite scroll, before anything has been added. Pagination or infinite scroll -- the initial load is likely the same.
> Doing so is probably a bad idea, since when new content gets added, the contents of a specific index pagination will change

unless you paginate in chronological order rather than reverse chronological. but no one does.

It's easy enough. Just request the next X results after the last result on the current page, like so: `?resultsperpage=50&resultsafter=postid750`. If you're not sorting by time, then you should be able to add a time constraint to the search backend and include that parameter as well. I've also seen forums that cache the results for a certain amount of time, and only return the search ID and page number in the query string.
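A minimal sketch of that "results after the last result" idea, done in memory. The query parameter names come from the example above; the function name and the shape of `posts` are assumptions, and a real backend would do this in the database query.

```javascript
// Return the `perPage` posts that come after the post with id `afterId`,
// from a list already sorted the way the site displays it.
// An unknown or missing `afterId` starts from the top of the list.
function pageAfter(posts, perPage, afterId) {
  const start = afterId == null ? 0 : posts.findIndex((p) => p.id === afterId) + 1;
  const items = posts.slice(start, start + perPage);
  const last = items[items.length - 1];
  const hasMore = start + items.length < posts.length;
  return {
    items,
    next: hasMore ? `?resultsperpage=${perPage}&resultsafter=${last.id}` : null,
  };
}

const posts = ['postid752', 'postid751', 'postid750', 'postid749', 'postid748']
  .map((id) => ({ id }));
console.log(pageAfter(posts, 2, null).next);
// ?resultsperpage=2&resultsafter=postid751
```

Because the page is anchored to a post id rather than an offset, new posts arriving at the top do not shift what the link returns.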
Yeah, but then I don't miss anything. I very much prefer it to infinite scroll, where it's possible to scroll "too fast" and miss content ...
Infinite scroll is one of those things that needs to die.
Nope, I love infinite scroll as a user. I even use extensions to add it to paginated websites. I wish those websites had infinite scroll as default or as an option cause infinite scroll extensions can be janky.

Not saying there aren't bad implementations of infinite scroll out there, those do need to die off, but they aren't that common.

I hate it as a user, it needs to die because it makes my user experience so much worse. I've never seen a not-broken implementation. Suddenly, the end key cannot take me to the bottom of the page, using the browser's native search function (which I will always choose over an in-page alternative) cannot do its thing, I can no longer get an idea of how far along I am in the list, and I cannot do a quick visual scan of many items because dragging the scrollbar does not behave naturally. Page up and page down is all but broken.

A compromise may be a browser option that one can set globally to indicate whether one wants pagination (and how many results per page) or infini-scroll.

I made a Firefox addon for myself that applies CSS to elements with text and/or attribute values that match a given regular expression. One of the first issues I ran into was that a lot of pages build their content via javascript so I added a delay to the scan. That mostly worked but the next issue (obvious in hindsight) was that this can happen at any time, multiple times, for various reasons. That's when I learned about MutationObserver for the first time and it's worked pretty much perfectly since.
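Roughly the shape of that approach, as a sketch rather than the addon's actual code. The function names and the class name are made up, it only reacts to added nodes (not to attribute or text edits), and the pattern should not carry the `g` flag because `test()` is stateful with it.

```javascript
// Does this element's own text (not its descendants') or any attribute value match?
function elementMatches(el, pattern) {
  const ownText = Array.from(el.childNodes || [])
    .filter((n) => n.nodeType === 3) // 3 = text node
    .map((n) => n.nodeValue)
    .join('');
  if (pattern.test(ownText)) return true;
  return Array.from(el.attributes || []).some((a) => pattern.test(a.value));
}

// Add `className` to matching elements now and whenever the page adds more.
function watchAndMark(root, pattern, className) {
  const mark = (node) => {
    if (node.nodeType === 1 && elementMatches(node, pattern)) node.classList.add(className);
  };
  root.querySelectorAll('*').forEach(mark);
  const observer = new MutationObserver((mutations) => {
    for (const m of mutations) {
      m.addedNodes.forEach((node) => {
        mark(node);
        if (node.nodeType === 1) node.querySelectorAll('*').forEach(mark);
      });
    }
  });
  observer.observe(root, { childList: true, subtree: true });
  return observer; // call observer.disconnect() to stop watching
}

// In a page: watchAndMark(document.body, /example\.com/i, 'marked');
```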

Here's the addon source [0]. I mostly (only) use the addon to style elements with links to sites I know I never want to visit, but it should work for this type of thing. However, the current UI isn't convenient for adding an adhoc rule.

[0] https://github.com/7w0/ssure/blob/4fd34677ad1c3f667ae85b939f...

What I have done in the past:

Scroll down a couple of days/pages/whatever and then use Ctrl+F. It's annoying as hell, but that usually works as most pages just add stuff at the bottom and don't unload the previous content on top

As mentioned above, there are some sites that page out the content that is no longer in the viewport after you've gone suitably far from it.

I still remember my consternation when I first noticed this, and it was because I tried exactly what you had suggested, but a previous match further up the page disappeared upon subsequent CTRL+F-ing.

If reddit is your only use case, there is always https://old.reddit.com/ ... until they take it down.

I've seen web apps with infinite scroll that will capture your Ctrl+F keypress and provide their own in-page text search tool (ex: https://i.imgur.com/BJPDDFw.png). I don't find it to be as easy/natural to use as the browser's built-in text search, but it's better than broken text search.

No, I prefer a broken text search. You can have a custom search, I'm fine with it, but don't overwrite Ctrl+F
Yes, I agree with you there. Don't overwrite my browser's shortcut keys.
You can paste something like this into your javascript console:

setInterval(function() { window.scrollTo(0, document.body.scrollHeight); }, 2000);

That will scroll continuously, so you can let it run for a while and then do ctrl-F. It might take a while but you can do something else. I've used it successfully many times (usually when I wanted to scrape the DOM for some reason or another, rather than just do a search).
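A variant of the same trick that stops by itself once the page height stops growing, as a sketch. The helper name and the "three unchanged ticks" rule are assumptions; tune the interval to how slowly the site loads.

```javascript
// Build a tick function that scrolls to the bottom and returns true ("done")
// once the height has stayed unchanged for `maxIdleTicks` ticks in a row.
function makeAutoScroller(getHeight, scrollToBottom, maxIdleTicks = 3) {
  let lastHeight = -1;
  let idle = 0;
  return function tick() {
    const h = getHeight();
    idle = h === lastHeight ? idle + 1 : 0;
    lastHeight = h;
    if (idle >= maxIdleTicks) return true; // nothing new loaded, stop
    scrollToBottom();
    return false;
  };
}

// Console wiring; skipped when not running in a browser.
if (typeof window !== 'undefined') {
  const tick = makeAutoScroller(
    () => document.body.scrollHeight,
    () => window.scrollTo(0, document.body.scrollHeight)
  );
  const timer = setInterval(() => { if (tick()) clearInterval(timer); }, 2000);
}
```

Like the one-liner, this only helps on pages that keep old content in the DOM.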

I've tried this twice in the last year or so: once on twitter and once on dropbox. Neither time did it work. It _seemed_ that ctrl+f would not find anything that was not visible +/- a screen height or two. I couldn't tell if it was a shortcoming of Firefox (unlikely IMO) or both sites tried to be clever about dynamically unloading content when it was outside the view window (more likely).
Seems to be a Firefox thing. Works fine on Chromium browsers, only finds stuff near the top in Firefox.
Or just put something heavy on your keyboard's "End" key :)
Ha yeah that works, if you don't want to do anything else on your computer for a while as it does its thing....
The naive way is to scroll down a ridiculous amount and then do a simple ctrl-F. It would be nice if infinite-scroll sites provided their own search function that searched through all potential content for that page.

(And then probably provide those search results on an infinite-scroll page, requiring another search function to search through those search results. It's infinite scroll all the way down.)

It's exactly after realizations like this that one (unreasonably, to be very explicit) decides all the GUIs are leaky abstractions and should be avoided.

Then you want to do some math and realize how amazing visual spreadsheets are.

What I mean is, there's no perfect solution and it all depends on what you're trying to achieve.

I’ve been finding it increasingly difficult to find blog templates that don’t do infinite scrolling with no way to turn it off. It’s not impossible, and most blog themes still have an option to turn it off if they’re included, but I’ve noticed it’s increasingly difficult to find themes without it. It seems to be the cool thing to do. But for setting up a static blog using the template, it’s a non-starter. Plus for all the aforementioned reasons: CTRL+F, linking to content, following a link and then going back, SEO (although some do SSR — still this is a non-starter for a statically published blog that just generates the pages using the theme)...
I open dev tools (or dev proxy) to see all the connections the site makes, grab the url to the "page" AJAX call. Put ' (single quote) into the URL which then returns a SQL error. Then I make a SQL injection to make the page return everything. And search that.
Rendering with JS causes many difficulties. It often simply makes the content less useful - how do we index such a page for efficient discovery (search) for example. Running JS locally to render pages is problematic and often does not work.

I cannot tell you how many times I have gone to commerce sites that use infinite scroll - scroll down - click on and look at a product - back button to listing page returns to top of scrolling page making me scroll down again. What a waste of time.

Hello friend, this is pretty simple, just change your user agent for this one:

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)

It's the IE 6 user agent, which sites treat as having no support for modern JS, so bye bye infinite scroll.

I cannot think of one single use case where infinite scroll makes sense compared to plain old data tables that are paginated with server side search/pagination built in.
Great for if you want your users to forget about how long they are spending on your site. Less friction and no jarring page loads, just a steady drip of dopamine.
I recall a time I scrolled through the whole page while using Fiddler or Burp as a proxy and then just searched the logged requests. Not pretty at all but it worked.
reddit has a JSON api you can access with your credentials.
More and more I feel like we should be simulating a human by macroing an actual typical web browser, including macroing the browser's dev tools.

(When scraping is fragile and you may have to periodically maintain scripts anyway, this is my conclusion after trying to automate a React website using recently-mentioned lib taiko, and finding it to be hit & miss, although it probably speaks more to my inexperience in general.)

This is a real interesting rant. I've definitely had this problem and wondered about the benefits of infinite scroll at those times. Also bad UX design.

I can't offer anything besides the fact that I love my mouse which has this free roll toggle button. When I press that I can roll the wheel really fast and long without anything slowing it down.

But it's sad that certain websites force users to such measures.

I haven't tested if this works for Reddit, but for some sites, if you use Opera, somehow they show you an 'old' version of the site that doesn't have auto-scrolling. I'm not sure why - perhaps it's because there's some kink in Opera that doesn't support it, but it's worth a try?
Maybe try Google with the `site:` keyword. You can also use the advanced search tools, so e.g. you can restrict results by date.
> And to make it clear: I don't want to search for stuff via $string site:reddit.com via some search engine
1. Click somewhere on the page so the scrolling works

2. Hold down Ctrl + End until a desired amount of pages loads

3. Ctrl + F to find $string

That's what I used to do but sites like Twitter unload content to not make your browser tab super slow (which is a good thing in general), which breaks ctrl+f.
Which is useless if elements outside the viewport are garbage collected because you are browsing a virtual list. Instagram is an example but they don't have text in the elements so it's not a huge hindrance.
Seems like there’s opportunity for an extension to grab all text as it appears, store in client side DB, search that. I am going to build that as a feature in an extension I am working on to track all of my upvotes as automated bookmarks.
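The storage side of that idea could be as small as the sketch below. The class name is made up, and it keeps everything in memory. A real extension would presumably feed it from a MutationObserver and persist to something like IndexedDB, which is assumed here and not shown.

```javascript
// Remember every distinct text snippet that was ever shown on the page.
class SeenTextLog {
  constructor() {
    this.entries = new Map(); // normalised text -> time first seen (ms)
  }

  // Record a snippet; whitespace is collapsed so re-rendered copies dedupe.
  add(text, now = Date.now()) {
    const t = text.trim().replace(/\s+/g, ' ');
    if (t && !this.entries.has(t)) this.entries.set(t, now);
  }

  // Case-insensitive substring search over everything seen so far.
  search(needle) {
    const n = needle.toLowerCase();
    return [...this.entries.keys()].filter((t) => t.toLowerCase().includes(n));
  }
}

const log = new SeenTextLog();
log.add('  Hello   world ');
log.add('Hello world'); // duplicate after normalisation, ignored
log.add('Something else');
console.log(log.search('WORLD')); // [ 'Hello world' ]
```

Since the log outlives the DOM nodes, it still finds text that a virtual list has already unloaded.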
Unless there's a mobile site, I usually do this: I use my Logitech MX Master 2s, hit the scroller really hard to let it spin for a while thanks to inertia, triggering as many "page" events as possible :-)
I had this exact problem yesterday while looking for someone in my list of Twitter people that I follow.

Also on YouTube when searching for something on my list of liked videos.

If you want to implement infinite scroll at least implement a damn search.

Hey, just like when you want to get to all the information in the footer like Contact Us, but the moment it appears, it gets scrolled away. I feel like a cat chasing a laser pointer.
On another note, infinite scrolling pages start using more and more memory as you scroll, which eventually requires closing the browser tab.
Discourse (forum) used to hijack Ctrl+F (Command+F) shortcut which makes it super irritating! Not sure if they have changed that.
That is only done if you are in a topic page where the amount of posts is higher than 20 replies.

In that case Ctrl+f is redirected to a search that is scoped to the current topic and can search it all.

Other pages don't have the search hijacked, and if they do you can press Ctrl+f twice to open the browser search.
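The behaviour described can be sketched as a small keydown handler. This is not Discourse's code: the function name and the `ui` object with isOpen/open/close are hypothetical, and only the threshold of 20 comes from the comment above.

```javascript
// Returns a keydown handler: the first Ctrl/Cmd+F opens the topic-scoped search,
// pressing it again while that search is open falls through to the browser.
function createFindHandler(postCount, ui, threshold = 20) {
  return function onKeyDown(e) {
    const isFind = (e.ctrlKey || e.metaKey) && String(e.key).toLowerCase() === 'f';
    if (!isFind || postCount <= threshold) return; // short topics keep native search
    if (ui.isOpen()) {
      ui.close(); // second press: step aside so the native find bar opens
      return;
    }
    e.preventDefault(); // first press: suppress the native find bar
    ui.open();
  };
}

// In a page: document.addEventListener('keydown', createFindHandler(count, ui));
```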

last i tried discourse was in 2016. things may have changed.
Hello friend, yeah, there is an option, just change your user agent, tested in several sites and works...
Use the search box, or figure out their API and use it directly.
switch to old reddit?

facebook, twitter etc have decided that their content is so low quality it's not even worth searching so they're just not supporting it

Go to Inspect element and do control + F to search for the string.
This still won't solve for lazy loading data on scroll, which is actually how it's supposed to be done, to reduce load time and DOM draw together.
I don't understand what you mean but if the infinite scroll content has already been loaded, the HTML for the same will update in the element inspector as well and you can search for the string there. Of course it's hacky but I use this technique all the time.
That assumes the site doesn't do something funny with tags to get around the user's desire to not risk infection from malvertisements.
This is no different than scrolling down and ctrl+F. Infinite scroll doesn't load content until you scroll down.
Ok I was assuming they meant that if they did Ctrl+F on the webpage and the string was towards the bottom area, the page would start infinite scrolling automatically after locating that string via search (and they wanted to avoid this). Here's a similar example: if I wanted to get the contact page link (located in the footer) on an infinite scroll page, I would never be able to get to the bottom because of infinite scrolling. But I could do a Ctrl+F in the inspector to locate the "contact" string and copy the contact page link from there.