UKC

Search function not bringing exact matches up first

This topic has been archived, and won't accept reply postings.
 Andy Moles 17 Oct 2022

As title, I'm sometimes finding when I type in the exact name of a route in the logbook search bar, it's not coming up as the top suggestion in the drop-down list - it's way down somewhere beneath a bunch of other routes that share one word or something. Is there a function to show more popular things first?

In reply to Andy Moles:

Hey Andy. There's an update in the works that tweaks the algo a bit to improve this. I'll try to get it out this week.

Can you give me some example queries and tell me what the expected results are?

In reply to Andy Moles:

Edit: the more the better. The more example queries I have in the test suite the easier it is to tweak the algorithm (which is somewhat hairy) with confidence it's still working as intended. 

 Neil Foster Global Crag Moderator 17 Oct 2022
In reply to Stephen Horne - Rockfax:

Hi Stephen

Difficult to give examples, but I have definitely experienced what Andy is describing. In fact I messaged Alan about this exact issue, though that was some time ago (definitely prior to the new ‘all things’ search box introduction), and rerunning those previous dodgy searches just now indicates that the algorithm has improved results since my original report.

That notwithstanding, here’s an interesting result:-

I just ran a search for Bitterfingers, which as you know, is a classic route at Stoney Middleton. The results list puts that very route as the top suggestion, so no issues there. What strikes me as a little odd is the rest of the suggested list.  In at no. 40 in the list is a route at the New Zealand crag, Fantasy Factory. Given that this route is called, er, Bitterfingers, I’d have expected it to come in at no. 2 in the list of suggested possibilities.

What strikes me as even more odd is that suggestions 2 to 39 in the list are all random (well, perhaps not completely random, as I think most are in the broad vicinity of Bitterfingers) routes at Stoney Middleton, none of which have names that read anything like Bitterfingers…

It seems to me that your algorithm is perhaps trying to be too clever here. Given that the same search box is used for both route and crag searches, there could be an argument that returning Stoney Middleton (or indeed Fantasy Factory) in the suggested results has a certain logic, but assuming that the list of suggestions should contain other random routes at an assumed crag seems completely illogical to me.  It also fills the suggestion list with spurious entries, which only makes it harder to spot a logical answer (in the above example, the entry at no. 40 in the list).

Hope that feedback is helpful, and best of luck fine tuning your algorithm!

Neil

In reply to Neil Foster:

Thanks Neil, that helps a lot. I've got that added and fixed in dev.

In reply to Andy Moles:

Changes are live now, so let me know if you hit any issues. Neil, your Bitterfingers one is working as you'd expect for sure. That was getting relegated because it has no ascents. I've now weighted an exact-name match higher than anything else.

This is done with a "string-contains" test, which checks whether your query contains the name of the route. So a query of "the rose scotland" contains the route name "the rose", but if you jumble the word order of your query it won't match.

e.g. "rose the scotland" will not give a route name of "the rose" extra weight.
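A minimal sketch of the string-contains test described above. The function name and the case/whitespace normalisation are assumptions for illustration; the thread does not show the site's actual code.

```python
def query_contains_name(query: str, route_name: str) -> bool:
    """Return True when the whole route name appears, in order, inside the query.

    Hypothetical helper: lower-casing and whitespace collapsing are assumed.
    """
    normalised_query = " ".join(query.lower().split())
    normalised_name = " ".join(route_name.lower().split())
    return bool(normalised_name) and normalised_name in normalised_query


# The two example queries from the post above.
print(query_contains_name("the rose scotland", "the rose"))
print(query_contains_name("rose the scotland", "the rose"))
```

Because this is a plain substring check rather than a word-by-word comparison, word order matters, which is why the jumbled query gets no extra weight.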

In reply to Andy Moles:

This update also promotes single grades to be considered as such, so a query of 

'8a malham'

will not do a string match on '8a'; it will only return routes whose grade is exactly '8a' (i.e. no '8a+'s in the results)

Previously this was only possible by creating an awkward twin-grade-range like '8a-8a' or using the somewhat tedious 'g(8a)' syntax.
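A rough sketch of how a lone grade token could be promoted to an exact grade filter. The regular expression only covers French sport grades such as '8a' or '8a+', and the function names and sample routes are hypothetical; the site's real grade handling is not shown in the thread.

```python
import re

# Assumption: only French sport grades (e.g. 6c, 7b+, 8a) are recognised here.
GRADE_RE = re.compile(r"[3-9][abc]\+?")


def split_query(query: str):
    """Separate tokens that look like a whole grade from ordinary text terms."""
    grades, terms = [], []
    for token in query.lower().split():
        if GRADE_RE.fullmatch(token):
            grades.append(token)
        else:
            terms.append(token)
    return grades, terms


def route_matches(route_grade: str, route_text: str, query: str) -> bool:
    """Grade tokens must equal the route grade; other terms are substring tests."""
    grades, terms = split_query(query)
    if grades and route_grade.lower() not in grades:
        return False
    return all(term in route_text.lower() for term in terms)


print(split_query("8a malham"))
print(route_matches("8a", "Example Route, Malham Cove", "8a malham"))
print(route_matches("8a+", "Another Route, Malham Cove", "8a malham"))
```

The key point is that the grade is compared for equality, so '8a+' is rejected even though it contains the characters '8a'.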

OP Andy Moles 18 Oct 2022
In reply to Stephen Horne - Rockfax:

Nice one, it's now working as it should for the example that prompted me to start the thread, I'll report back if I notice any anomalies.

 Neil Foster Global Crag Moderator 18 Oct 2022
In reply to Stephen Horne - Rockfax:

Thanks Stephen.

Yes, your change now puts the 2 climbs named Bitterfingers at the top of the list, correctly identifying and placing first, the one that a UK website search would most likely be looking for.

However, I’d still question the usefulness of then listing 38 random climbs at Stoney Middleton, none of which is called Bitterfingers?  If you really want to pad out your list of suggestions (given that you already have a perfect match), would it not make more sense to just list the 2 crags with a route called Bitterfingers, and leave it at that?

I did find a search which didn’t quite return the list in the order you’d expect when I searched for a route on Kalymnos named Yanap.  Again, a perfect match entry exists, but in this case there is a mountain (Yanapaccha) whose name starts with those same 5 letters, and it, rather than the perfect match, is at the top of the suggestions list.

Again, I see that the 2 routes on that mountain are included in the list, and whilst this particular list is of a manageable length, I’d still argue that (despite what Google would want us to believe) in the case of the UKC search algorithm, less is more and the suggestions list would be more helpful without these superfluous routes.

Here’s a final example to illustrate my point.

There’s a great route on the Cornice in Cheedale (at least it was once a great trad route!) called Fey.  Searching for Fey brings up the 2 (trad and sport) versions of that climb, but despite being a perfect match, those are only entries 2 and 3 in the suggestions list.

In at no. 1 in the list of suggestions is the delightfully named Swiss crag Sex du Parc aux Feyes.

This crag has 63 routes listed in the Logbook entry.  Those 63 routes (completely superfluous in my view, as explained above) then appear in the suggestions list, though this time they aren’t even all together, because one of the route names at the crag contains the letters Fee, which the algorithm considers (presumably) to be an alternate spelling to the search string, though why this is considered less likely than Goofey on Hell’s Lum, but more likely than Wifey at Rocklands is completely beyond me….!

What is perhaps more odd is that the famous route Fay at Sharpnose isn’t listed, on the same ‘potential misspelling’ basis. And try searching for Fay, for another example of much of what I’m describing here.

Anyway, back to the Fey search, looking further down the suggestions list, there is another such example in the suggestions:-  Mehr John Coffeys.  But why does that suggestion only merit position 69 on the list of suggestions, whereas Goofey merits position 4 in the list?

What’s even odder is what the algorithm returns at the end of the list, when a single Cornice climb located next to Fey is listed, together with a completely random climb at Horseshoe Quarry…!

If the algorithm can be tweaked so the suggestions list for a search for Fey puts the route Fey at the top; removes all the routes at Sex du Parc aux Feyes; removes the random route at Horseshoe and the unnecessary alternative Cornice route suggestion; perhaps adds the crag The Cornice; then I think the more focused, much shorter suggestions list would be far more useful to the person making the search.

Once again, I hope this is useful and wish you luck in further refining the algorithm.

Neil

In reply to Neil Foster:

Hey Neil

> However, I’d still question the usefulness of then listing 38 random climbs at Stoney Middleton, none of which is called Bitterfingers?  If you really want to pad out your list of suggestions (given that you already have a perfect match), would it not make more sense to just list the 2 crags with a route called Bitterfingers, and leave it at that?

This is because they are in a sector called 'bitterfingers bay'. It's maybe a bit confusing because the sector name is not displayed in the result.

> I did find a search which didn’t quite return the list in the order you’d expect when I searched for a route on Kalymnos named Yanap.  Again, a perfect match entry exists, but in this case there is a mountain (Yanapaccha) whose name starts with those same 5 letters, and it, rather than the perfect match, is at the top of the suggestions list.

This is because we've chosen to display crags in their own section, which always precedes the routes. I'm not married to this choice but it's how it is for now. We'll consider having a user setting that allows them to be returned in a search-score order however.
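As an illustration of the display choice described above, here is a small sketch that lists crags in their own section ahead of routes, with each section sorted by score. The field names and the score values are made up for the example.

```python
def order_results(results):
    """Put all crags first, then all routes; each group is sorted by score."""
    crags = [r for r in results if r["kind"] == "crag"]
    routes = [r for r in results if r["kind"] == "route"]
    by_score = lambda r: r["score"]
    return sorted(crags, key=by_score, reverse=True) + sorted(routes, key=by_score, reverse=True)


# Hypothetical scores: the exact-match route scores higher than the crag.
results = [
    {"kind": "route", "name": "Yanap", "score": 100.0},
    {"kind": "crag", "name": "Yanapaccha", "score": 40.0},
]
print([r["name"] for r in order_results(results)])
```

Under this grouping the crag is listed first even though the route has the higher score, which matches the behaviour reported for the Yanap search.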

> What is perhaps more odd is that the famous route Fay at Sharpnose isn’t listed, on the same ‘potential misspelling’ basis. And try searching for Fay, for another example of much of what I’m describing here.

This is a personal preference. We're not fans of having a search engine guess what we mean, so misspellings are not accounted for.

> Anyway, back to the Fey search, looking further down the suggestions list, there is another such example in the suggestions:-  Mehr John Coffeys.  But why does that suggestion only merit position 69 on the list of suggestions, whereas Goofey merits position 4 in the list?

That's because the string "fey" is closer to "goofey" than it is to "Mehr John Coffeys".
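The thread does not say which closeness measure is used, so the following is only one plausible illustration: the fraction of the candidate name that the query covers. The function name is hypothetical, and the live ranking evidently also involves other signals (ascent counts were mentioned earlier), so this will not reproduce the site's exact ordering.

```python
def coverage(query: str, name: str) -> float:
    """Share of the name's characters accounted for by the query substring.

    Illustrative measure only; returns 0.0 when the query is not in the name.
    """
    q, n = query.lower(), name.lower()
    if not q or q not in n:
        return 0.0
    return len(q) / len(n)


for candidate in ["Fey", "Goofey", "Mehr John Coffeys"]:
    print(candidate, round(coverage("fey", candidate), 3))
```

On this measure "fey" covers half of "Goofey" but only a small part of "Mehr John Coffeys", so the shorter name ranks higher.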

> What’s even odder is what the algorithm returns at the end of the list, when a single Cornice climb located next to Fey is listed, together with a completely random climb at Horseshoe Quarry…!

That's due to the algo also checking the routes' descriptions and FA (first ascent) details, but weighting them lower than route name or crag name.

> If the algorithm can be tweaked so the suggestions list for a search for Fey puts the route Fey at the top; removes all the routes at Sex du Parc aux Feyes; removes the random route at Horseshoe and the unnecessary alternative Cornice route suggestion; perhaps adds the crag The Cornice; then I think the more focused, much shorter suggestions list would be far more useful to the person making the search.

If the search didn't match on e.g. sector name or FA details it wouldn't be that great. The results you're seeing are because of these kinds of matches. Climbers can be a bit unimaginative with sector names, so sectors are often named after the best route in them, which means all of that sector's routes will appear in the search. But they are weighted less, so they appear further down the list.
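To make the weighting idea concrete, here is a minimal sketch of field-weighted scoring. The weights, field names and sample routes are all assumptions chosen for illustration, and only the relative order (exact name highest, sector lower) reflects what is described in this thread.

```python
# Hypothetical weights: name and crag count for more than sector, description or FA.
FIELD_WEIGHTS = {"name": 10.0, "crag": 6.0, "sector": 3.0, "description": 1.0, "fa": 1.0}
EXACT_NAME_BONUS = 100.0  # assumed value; an exact-name match outranks everything else


def score(route: dict, query: str) -> float:
    """Sum the weights of every field containing the query, plus the name bonus."""
    q = query.lower()
    total = sum(w for f, w in FIELD_WEIGHTS.items() if q in route.get(f, "").lower())
    name = route.get("name", "").lower()
    if name and name in q:  # the query contains the whole route name
        total += EXACT_NAME_BONUS
    return total


def search(routes, query):
    """Return (name, crag) pairs for every matching route, best score first."""
    hits = [(score(r, query), r) for r in routes]
    hits = [h for h in hits if h[0] > 0]
    hits.sort(key=lambda h: h[0], reverse=True)
    return [(r["name"], r["crag"]) for _, r in hits]


# Illustrative data only.
routes = [
    {"name": "Neighbouring Route", "crag": "Stoney Middleton", "sector": "Bitterfingers Bay"},
    {"name": "Bitterfingers", "crag": "Fantasy Factory", "sector": ""},
    {"name": "Bitterfingers", "crag": "Stoney Middleton", "sector": "Bitterfingers Bay"},
    {"name": "Unrelated Route", "crag": "Elsewhere", "sector": "Main Wall"},
]
for hit in search(routes, "bitterfingers"):
    print(hit)
```

In this sketch the sector-only match still appears, but below both routes actually named Bitterfingers, and the unrelated route is dropped altogether.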

This is how the search is designed to work. It's a freeform omni search. The most relevant results are at the top, but anything that matches is below, ranked by search score. If you really want to restrict the search to route names you can use the special syntax 'r(bitterfingers)', but that to me seems more effort than just ignoring all results after the first one.
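For illustration, the scoped syntax could be pulled out of a query along these lines. Only the 'r(...)' and 'g(...)' forms are mentioned in this thread; the parser, its name and its output shape are assumptions, not the site's implementation.

```python
import re

# Matches r(...) or g(...); other prefixes, if the site has any, are not handled.
SCOPED_RE = re.compile(r"\b([rg])\(([^)]*)\)")


def parse_query(query: str) -> dict:
    """Split a query into route-name filters, grade filters and free text."""
    scoped = {"r": [], "g": []}
    for prefix, value in SCOPED_RE.findall(query):
        scoped[prefix].append(value.strip())
    free_text = " ".join(SCOPED_RE.sub(" ", query).split())
    return {"route_names": scoped["r"], "grades": scoped["g"], "free_text": free_text}


print(parse_query("r(bitterfingers)"))
print(parse_query("g(8a) malham"))
```

A search layer could then apply the route-name and grade filters strictly and run the remaining free text through the normal omni search.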

