fix(search): resolve issue when using `=` operator in search #7268

perfectra1n · 2025-10-10T19:01:05Z

make sure that =text returns matches that only include "text"
create additional e2e and unit tests to validate change

Phrase example:

Word example (with working highlighting!)

gemini-code-assist · 2025-10-10T19:01:22Z

Summary of Changes

Hello @perfectra1n, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves a long-standing issue in the search mechanism where the = operator failed to provide true exact word matches, often returning partial or substring results. The changes introduce a precise word-boundary matching algorithm, enhancing the accuracy and reliability of exact searches. This ensures that users can confidently use the = operator to find specific terms without irrelevant noise, backed by a robust suite of new and updated tests.

Highlights

Exact Search Fix: The core search functionality has been updated to correctly handle the = operator for exact word matching, ensuring that queries like =text only return results that contain 'text' as a whole word, rather than partial matches.
New containsExactWord Method: A new private utility method, containsExactWord, was introduced in note_content_fulltext.ts to perform case-insensitive, word-boundary-aware matching, which is now leveraged by the = and != search operators.
Updated Comparator Logic: The build_comparator.ts service now uses the new exact word matching logic for the = and != string comparators, ensuring consistent behavior across different search contexts.
Comprehensive Test Coverage: Extensive new E2E tests (exact_search.spec.ts) and expanded unit tests (search.spec.ts) have been added to validate the exact search functionality, covering scenarios such as partial matches, exact matches, case insensitivity, and handling of special characters.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request aims to fix the exact search operator =. The changes correctly implement exact word matching for single words. However, there's a critical issue with how quoted phrases (multi-word tokens) are handled in exact searches. The current implementation checks if any single word in the content matches the entire phrase, which will almost always fail. It should instead verify that all words from the phrase are present as distinct words in the content. This issue is present in both note_content_fulltext.ts and build_comparator.ts. Additionally, this logic is duplicated in both files and should be centralized. The new e2e and unit tests are very thorough and a great addition, though one e2e test has a weak assertion that should be strengthened.

gemini-code-assist · 2025-10-10T19:03:59Z

apps/server/src/services/search/expressions/note_content_fulltext.ts

+    /**
+     * Checks if content contains the exact word (with word boundaries)
+     * This is case-insensitive since content and token are already normalized
+     */
+    private containsExactWord(token: string, content: string): boolean {
+        // Normalize both for case-insensitive comparison
+        const normalizedToken = normalizeSearchText(token);
+        const normalizedContent = normalizeSearchText(content);
+
+        // Split content into words and check for exact match
+        const words = normalizedContent.split(/\s+/);
+        return words.some(word => word === normalizedToken);
+    }


The containsExactWord method is implemented incorrectly for multi-word tokens (e.g., from a quoted search like ="exact phrase"). It checks if any single word in the content is equal to the entire token phrase, which will almost always be false. The correct logic should be to split the token into words and verify that all of those words exist in the content.

Additionally, the comment on line 94 is misleading as the function performs normalization. The normalization of content is also redundant since it's already normalized before this function is called.

The logic here is also duplicated in apps/server/src/services/search/services/build_comparator.ts. This should ideally be centralized in a utility function.

/** * Checks if content contains all words from the given token. * This is case-insensitive. Assumes content is already normalized. */ private containsExactWord(token: string, content: string): boolean { // content is assumed to be pre-normalized. const contentWords = new Set(content.split(/\s+/)); const tokenWords = normalizeSearchText(token).split(/\s+/); return tokenWords.every(word => contentWords.has(word)); }

gemini-code-assist · 2025-10-10T19:03:59Z

apps/server/src/services/search/services/build_comparator.ts

+    "=": (comparedValue) => (val) => {
+        // For the = operator, check if the value contains the exact word (word-boundary matching)
+        // This is case-insensitive since both values are already lowercased
+        if (!val) return false;
+
+        const normalizedVal = normalizeSearchText(val);
+        const normalizedCompared = normalizeSearchText(comparedValue);
+
+        // Split into words and check for exact match
+        const words = normalizedVal.split(/\s+/);
+        return words.some(word => word === normalizedCompared);
+    },
+    "!=": (comparedValue) => (val) => {
+        // Negation of exact word match
+        if (!val) return true;
+
+        const normalizedVal = normalizeSearchText(val);
+        const normalizedCompared = normalizeSearchText(comparedValue);
+
+        const words = normalizedVal.split(/\s+/);
+        return !words.some(word => word === normalizedCompared);
+    },


The comparators for = and != are implemented incorrectly for multi-word values (e.g., from a quoted attribute search like #title="foo bar"). The current logic for = checks if any single word in the attribute value is equal to the entire search phrase, which is incorrect. It should check if all words from the search phrase are present in the attribute value. The != comparator has the inverse issue.

The comment on line 18 is also misleading as normalization is performed within the function. This logic is also duplicated from note_content_fulltext.ts and would benefit from being a shared utility function.

"=": (comparedValue) => (val) => { // For the = operator, check if the value contains all words from the comparedValue. // This is case-insensitive. if (!val) return false; const valWords = new Set(normalizeSearchText(val).split(/\s+/)); const comparedWords = normalizeSearchText(comparedValue).split(/\s+/); return comparedWords.every(cw => valWords.has(cw)); }, "!=": (comparedValue) => (val) => { // Negation of exact word match: check if value does NOT contain all words from comparedValue. if (!val) return true; const valWords = new Set(normalizeSearchText(val).split(/\s+/)); const comparedWords = normalizeSearchText(comparedValue).split(/\s+/); return !comparedWords.every(cw => valWords.has(cw)); },

gemini-code-assist · 2025-10-10T19:03:59Z

apps/server-e2e/src/exact_search.spec.ts

+    test.beforeEach(async ({ page, context }) => {
+        const app = new App(page, context);
+        await app.goto();
+
+        // Get CSRF token
+        csrfToken = await page.evaluate(() => {
+            return (window as any).glob.csrfToken;
+        });
+
+        expect(csrfToken).toBeTruthy();
+
+        // Create test notes with specific content patterns
+        // Note 1: Contains exactly "pagio" in title
+        const note1 = await page.request.post(`${BASE_URL}/api/notes/root/children?target=into&targetBranchId=`, {
+            headers: { "x-csrf-token": csrfToken },
+            data: {
+                title: "Test Note with pagio",
+                content: "This note contains the word pagio in the content.",
+                type: "text"
+            }
+        });
+        expect(note1.ok()).toBeTruthy();
+        const note1Data = await note1.json();
+        createdNoteIds.push(note1Data.note.noteId);
+
+        // Note 2: Contains "page" (not exact match)
+        const note2 = await page.request.post(`${BASE_URL}/api/notes/root/children?target=into&targetBranchId=`, {
+            headers: { "x-csrf-token": csrfToken },
+            data: {
+                title: "Test Note with page",
+                content: "This note contains the word page in the content.",
+                type: "text"
+            }
+        });
+        expect(note2.ok()).toBeTruthy();
+        const note2Data = await note2.json();
+        createdNoteIds.push(note2Data.note.noteId);
+
+        // Note 3: Contains "pages" (plural, not exact match)
+        const note3 = await page.request.post(`${BASE_URL}/api/notes/root/children?target=into&targetBranchId=`, {
+            headers: { "x-csrf-token": csrfToken },
+            data: {
+                title: "Test Note with pages",
+                content: "This note contains the word pages in the content.",
+                type: "text"
+            }
+        });
+        expect(note3.ok()).toBeTruthy();
+        const note3Data = await note3.json();
+        createdNoteIds.push(note3Data.note.noteId);
+
+        // Note 4: Contains "homepage" (contains "page", not exact match)
+        const note4 = await page.request.post(`${BASE_URL}/api/notes/root/children?target=into&targetBranchId=`, {
+            headers: { "x-csrf-token": csrfToken },
+            data: {
+                title: "Homepage Note",
+                content: "This note is about homepage content.",
+                type: "text"
+            }
+        });
+        expect(note4.ok()).toBeTruthy();
+        const note4Data = await note4.json();
+        createdNoteIds.push(note4Data.note.noteId);
+
+        // Note 5: Another note with exact "pagio" in content
+        const note5 = await page.request.post(`${BASE_URL}/api/notes/root/children?target=into&targetBranchId=`, {
+            headers: { "x-csrf-token": csrfToken },
+            data: {
+                title: "Another pagio Note",
+                content: "This is another note with pagio content for testing exact matches.",
+                type: "text"
+            }
+        });
+        expect(note5.ok()).toBeTruthy();
+        const note5Data = await note5.json();
+        createdNoteIds.push(note5Data.note.noteId);
+
+        // Note 6: Contains "pagio" in title only
+        const note6 = await page.request.post(`${BASE_URL}/api/notes/root/children?target=into&targetBranchId=`, {
+            headers: { "x-csrf-token": csrfToken },
+            data: {
+                title: "pagio",
+                content: "This note has pagio as the title.",
+                type: "text"
+            }
+        });
+        expect(note6.ok()).toBeTruthy();
+        const note6Data = await note6.json();
+        createdNoteIds.push(note6Data.note.noteId);
+
+        // Wait a bit for indexing
+        await page.waitForTimeout(500);
+    });
+
+    test.afterEach(async ({ page }) => {
+        // Clean up created notes
+        for (const noteId of createdNoteIds) {
+            try {
+                const taskId = `cleanup-${Math.random().toString(36).substr(2, 9)}`;
+                await page.request.delete(`${BASE_URL}/api/notes/${noteId}?taskId=${taskId}&last=true`, {
+                    headers: { "x-csrf-token": csrfToken }
+                });
+            } catch (e) {
+                console.error(`Failed to delete note ${noteId}:`, e);
+            }
+        }
+        createdNoteIds = [];
+    });
+
+    test("Quick search without = operator returns all partial matches", async ({ page }) => {
+        // Test the /quick-search endpoint without the = operator
+        const response = await page.request.get(`${BASE_URL}/api/quick-search/pag`, {
+            headers: { "x-csrf-token": csrfToken }
+        });
+
+        expect(response.ok()).toBeTruthy();
+        const data = await response.json();
+
+        // Should return multiple notes including "page", "pages", "homepage"
+        expect(data.searchResultNoteIds).toBeDefined();
+        expect(data.searchResults).toBeDefined();
+
+        // Filter to only our test notes
+        const testResults = data.searchResults.filter((result: any) =>
+            result.noteTitle.includes("page") ||
+            result.noteTitle.includes("pagio") ||
+            result.noteTitle.includes("Homepage")
+        );
+
+        // Should find at least "page", "pages", "homepage", and "pagio" notes
+        expect(testResults.length).toBeGreaterThanOrEqual(4);
+
+        console.log("Quick search 'pag' found:", testResults.length, "matching notes");
+        console.log("Note titles:", testResults.map((r: any) => r.noteTitle));
+    });
+
+    test("Quick search with = operator returns only exact matches", async ({ page }) => {
+        // Test the /quick-search endpoint WITH the = operator
+        const response = await page.request.get(`${BASE_URL}/api/quick-search/=pagio`, {
+            headers: { "x-csrf-token": csrfToken }
+        });
+
+        expect(response.ok()).toBeTruthy();
+        const data = await response.json();
+
+        // Should return only notes with exact "pagio" match
+        expect(data.searchResultNoteIds).toBeDefined();
+        expect(data.searchResults).toBeDefined();
+
+        // Filter to only our test notes
+        const testResults = data.searchResults.filter((result: any) =>
+            createdNoteIds.includes(result.notePath.split("/").pop() || "")
+        );
+
+        console.log("Quick search '=pagio' found:", testResults.length, "matching notes");
+        console.log("Note titles:", testResults.map((r: any) => r.noteTitle));


The test file has some opportunities for improvement in terms of maintainability and type safety:

Code Duplication: The beforeEach block (lines 18-110) has a lot of duplicated code for creating notes. This could be refactored into a helper function to make the setup cleaner and easier to maintain.

Type Safety: The filter callbacks use any for the result parameter (e.g., line 141, 168). Defining an interface for the search result object would improve type safety.

Debugging Logs: There are several console.log statements (e.g., lines 150-151, 172-173) which are useful for debugging but should be removed from the final test code to keep the test output clean.

gemini-code-assist · 2025-10-10T19:04:00Z

apps/server-e2e/src/exact_search.spec.ts

+
+        // Should find the note with exact "test" match, but not "testing"
+        // Note: This test may fail if the implementation doesn't properly handle exact matching in content
+        expect(ourTestNotes.length).toBeGreaterThan(0);


The assertion expect(ourTestNotes.length).toBeGreaterThan(0); is too lenient for this test case. To properly verify that an exact search for =test does not match "testing", the assertion should be more specific and check that exactly one note is found.

Suggested change

expect(ourTestNotes.length).toBeGreaterThan(0);

expect(ourTestNotes.length).toBe(1);

contributor · 2025-10-21T09:06:49Z

@perfectra1n search with = operator only shows one note for me in this PR. But actually there are multiple notes in db.

perfectra1n · 2025-10-21T16:28:19Z

@contributor can you provide more details? Was the search term ='test' or was it ='test phrase'? How many notes would you expect for it to have found, were they all text notes?

contributor · 2025-10-21T18:06:00Z

@perfectra1n

Commit 4fa4112

The search term =pagio and ='pagio' returns the same (incorrect) result - only one note.
I have 45 notes with word pagio.

The search term pagio return 114 notes - this includes false positives due to fuzzy search (the first 45 of them are correct notes).

fix(search): resolve issue when using = operator in search

50f0b88

dosubot bot added the size:XL This PR changes 500-999 lines, ignoring generated files. label Oct 10, 2025

gemini-code-assist bot reviewed Oct 10, 2025

View reviewed changes

feat(search): also support the use of ="exact match search string"

4fa4112

fix(search): also support exact phrase matching such as ='test phrase'

fb0d971

perfectra1n added 2 commits October 21, 2025 11:30

Merge branch 'main' into fix/fix-equals-operator-in-search

b03cb1c

fix(search): make sure to highlight exact search results too

8e227a6

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Uh oh!

fix(search): resolve issue when using `=` operator in search #7268

fix(search): resolve issue when using `=` operator in search #7268

Uh oh!

perfectra1n commented Oct 10, 2025 •

edited

Loading

Uh oh!

gemini-code-assist bot commented Oct 10, 2025

Uh oh!

gemini-code-assist bot left a comment

Uh oh!

gemini-code-assist bot Oct 10, 2025

Uh oh!

gemini-code-assist bot Oct 10, 2025

Uh oh!

gemini-code-assist bot Oct 10, 2025

Uh oh!

gemini-code-assist bot Oct 10, 2025

Uh oh!

contributor commented Oct 21, 2025 •

edited

Loading

Uh oh!

perfectra1n commented Oct 21, 2025

Uh oh!

contributor commented Oct 21, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

	expect(ourTestNotes.length).toBeGreaterThan(0);
	expect(ourTestNotes.length).toBe(1);

Uh oh!

Uh oh!

fix(search): resolve issue when using = operator in search #7268

Are you sure you want to change the base?

fix(search): resolve issue when using = operator in search #7268

Uh oh!

Conversation

perfectra1n commented Oct 10, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

gemini-code-assist bot commented Oct 10, 2025

Summary of Changes

Highlights

Footnotes

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist bot Oct 10, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Oct 10, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Oct 10, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Oct 10, 2025

Choose a reason for hiding this comment

Uh oh!

contributor commented Oct 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

perfectra1n commented Oct 21, 2025

Uh oh!

contributor commented Oct 21, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

fix(search): resolve issue when using `=` operator in search #7268

fix(search): resolve issue when using `=` operator in search #7268

perfectra1n commented Oct 10, 2025 •

edited

Loading

contributor commented Oct 21, 2025 •

edited

Loading