Bug: getCanonicalPageId does not support non-latin page titles #422

Closed
@marharyta

Description

Unfortunately, I could not open a PR: the repo seems to require permission before I can push a new branch. Nevertheless, here is the problem description and my proposed solution:
Screenshot 2023-01-21 at 18 40 14

Problem: getCanonicalPageId does not support non-latin page titles

Issue:

I am using [Notion.so](http://Notion.so) to run the [FinUA.org](http://FinUA.org) website, and it is currently deployed with [super.so](http://super.so). I have been using the nextjs-notion-starter-kit project for it (thank you).

After deploying the project to Vercel, I noticed quite a few browser warnings about the page, caused by the generated page URLs (they looked broken).
Screenshot 2023-01-21 at 18 41 25

The page behind it:
Screenshot 2023-01-21 at 18 41 37

Moreover, this page also got the same generated URL, /-, despite being a separate page, and clicking on it led to the first page.

Screenshot 2023-01-21 at 18 42 13

I investigated, and the problem seems to be in the module https://github.com/transitive-bullshit/nextjs-notion-starter-kit/blob/main/lib/get-canonical-page-id.ts

import { ExtendedRecordMap } from 'notion-types'
import {
  getCanonicalPageId as getCanonicalPageIdImpl,
  parsePageId
} from 'notion-utils'

import { inversePageUrlOverrides } from './config'

export function getCanonicalPageId(
  pageId: string,
  recordMap: ExtendedRecordMap,
  { uuid = true }: { uuid?: boolean } = {}
): string | null {
  const cleanPageId = parsePageId(pageId, { uuid: false })
  if (!cleanPageId) {
    return null
  }

  const override = inversePageUrlOverrides[cleanPageId]
  if (override) {
    return override
  } else {
    // PROBLEM: this call seems to be the issue
    return getCanonicalPageIdImpl(pageId, recordMap, {
      uuid
    })
  }
}

I went to the module https://github.com/NotionX/react-notion-x/tree/master/packages/notion-utils

and copied the https://github.com/NotionX/react-notion-x/blob/master/packages/notion-utils/src/get-canonical-page-id.ts module. The problem seems to be in the getCanonicalPageId function: the title normalization it calls, normalizeTitle(getBlockTitle(block, recordMap)), only seems to handle Latin characters (plus a CJK range, judging by the regex below).

I pulled out the normalizeTitle function, and yes, that seems to be the case:

function normalizeTitle(title) {
  return (title || '')
    .replace(/ /g, '-')
    .replace(/[^a-zA-Z0-9-\u4e00-\u9fa5]/g, '')
    .replace(/--/g, '-')
    .replace(/-$/, '')
    .replace(/^-/, '')
    .trim()
    .toLowerCase()
}

const eng = normalizeTitle('Naapurin Maalaiskana (NMK), in Lieto, in Turku area');
const ukr = normalizeTitle('Робота помічника з обслуговування контейнерів');
const ukr1 = normalizeTitle('Ищем литейщиков в Карккила, Финляндия, для обработки изделий в металлургической промышленности');
console.log('test', eng, ukr, ukr1)

// "test"
// "naapurin-maalaiskana-nmk-in-lieto-in-turku-area"
// ""
// "---"
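
To see why both Cyrillic titles collapse, the replace chain can be replayed one stage at a time. This is only a sketch: traceNormalizeTitle is a hypothetical helper (it is not part of notion-utils) that copies the regexes from normalizeTitle above and records the string after each step.

```typescript
// Hypothetical debugging helper, not part of notion-utils.
// Replays the replace chain from normalizeTitle and returns the
// intermediate string after each stage.
function traceNormalizeTitle(title: string): string[] {
  const stages: Array<(s: string) => string> = [
    (s) => s.replace(/ /g, '-'), // spaces become dashes
    (s) => s.replace(/[^a-zA-Z0-9-\u4e00-\u9fa5]/g, ''), // drops every Cyrillic letter
    (s) => s.replace(/--/g, '-'), // single pass: halves a run of dashes
    (s) => s.replace(/-$/, ''), // removes one trailing dash
    (s) => s.replace(/^-/, '') // removes one leading dash
  ]

  const results: string[] = []
  let current = title
  for (const stage of stages) {
    current = stage(current)
    results.push(current)
  }
  return results
}

console.log(traceNormalizeTitle('Робота помічника з обслуговування контейнерів'))
// [ 'Робота-помічника-з-обслуговування-контейнерів', '----', '--', '-', '' ]
```

After the second stage only the dashes that replaced the spaces are left, so the result depends on the number of words in the title and not on its text. That is why the shorter title ends up as "" and the longer one as "---".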

Solution:

The fix that worked for me was simply replacing normalizeTitle(getBlockTitle(block, recordMap)) with slugify from the transliteration npm package.
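
If transliteration is not wanted, another option would be to keep the non-Latin letters in the slug. The following is a minimal sketch of that alternative. It is not the slugify-based fix described above and not the library's code: normalizeTitleUnicode is a hypothetical name, and it assumes a runtime that supports Unicode property escapes in regular expressions (ES2018 or later).

```typescript
// Hypothetical Unicode-aware variant of normalizeTitle (an assumption,
// not the notion-utils implementation). It keeps letters, combining
// marks, and digits from any script instead of only Latin and CJK.
// The pattern is built with the RegExp constructor so that it does not
// depend on the TypeScript compile target.
const DISALLOWED = new RegExp('[^\\p{L}\\p{M}\\p{N}-]', 'gu')

function normalizeTitleUnicode(title?: string | null): string {
  return (title || '')
    .trim()
    .replace(/\s+/g, '-') // whitespace runs become a single dash
    .replace(DISALLOWED, '') // drop punctuation and symbols only
    .replace(/-+/g, '-') // collapse any run of dashes
    .replace(/^-|-$/g, '') // strip a leading and a trailing dash
    .toLowerCase()
}

console.log(normalizeTitleUnicode('Робота помічника з обслуговування контейнерів'))
// робота-помічника-з-обслуговування-контейнерів
```

The trade-off is that such slugs are percent-encoded when sent in a URL, whereas transliteration produces plain ASCII paths.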

Notion Test Page ID

701245d6db8c413689d180e87269ee56
