Rewrite prefix generation and replacement logic #1539
- Add splitIri, generatePrefix, commonPrefixes utility modules
- Add PrefixLookup class for fast Map-based prefix lookups
- Rewrite generatePrefixes to use splitIri and generatePrefix
- Rewrite replacePrefixes to use PrefixLookup and splitIri
- Change usePrefixes/prefixesAtom to return PrefixLookup
- Remove __matches from PrefixTypeConfig
- Simplify saveConfigurationToFile and useImportConnectionFile
- Add backward compatibility tests for legacy __matches data
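To illustrate the Map-based lookup idea from the commit message, here is a minimal sketch. The names `PrefixEntry`, `splitIriSketch`, and `PrefixLookupSketch` are hypothetical, and the split rule (cut after the last `#`, otherwise after the last `/`) is an assumption, not necessarily what the PR's `splitIri` and `PrefixLookup` do.

```typescript
// Hypothetical shapes; these names are illustrative, not the PR's actual API.
type PrefixEntry = { prefix: string; uri: string };

/** Assumed rule: split after the last "#" if present, otherwise after the last "/". */
function splitIriSketch(
  iri: string,
): { namespace: string; value: string } | null {
  const hash = iri.lastIndexOf("#");
  const cut = hash >= 0 ? hash : iri.lastIndexOf("/");
  // No separator, or nothing after it: there is no local value to abbreviate.
  if (cut < 0 || cut === iri.length - 1) return null;
  return { namespace: iri.slice(0, cut + 1), value: iri.slice(cut + 1) };
}

class PrefixLookupSketch {
  // Namespace URI -> prefix, so each lookup is a single Map access
  // instead of a scan over every known prefix.
  private readonly byNamespace = new Map<string, string>();

  constructor(entries: PrefixEntry[]) {
    for (const { prefix, uri } of entries) {
      // Assumption: the first entry wins when two prefixes share a namespace.
      if (!this.byNamespace.has(uri)) this.byNamespace.set(uri, prefix);
    }
  }

  /** Returns "prefix:value" when the namespace is known, else the IRI unchanged. */
  replace(iri: string): string {
    const parts = splitIriSketch(iri);
    if (!parts) return iri;
    const prefix = this.byNamespace.get(parts.namespace);
    return prefix === undefined ? iri : `${prefix}:${parts.value}`;
  }
}
```

Splitting the IRI first means the namespace can be used directly as a Map key, which is what makes the lookup cost independent of the number of configured prefixes.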
bfa1d4f to 63011a9
82efdf6 to 9dc2a68
- Reject IRIs whose local value contains characters invalid in RDF prefixed names: spaces, angle brackets, curly braces, pipes, carets, backticks, and backslashes
- Allow underscores, hyphens, periods, percent-encoded sequences, Unicode letters, and middle dots
- Remove duplicate tests that were outside the describe block
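The rejection rule in this commit message can be sketched as a deny-list check. The helper name `isValidLocalValueSketch` and the choice of a deny-list regular expression are assumptions for illustration; the commit does not say how the check is actually implemented.

```typescript
// Characters the commit message lists as invalid in a local value:
// space, < >, { }, |, ^, backtick, and backslash.
const INVALID_LOCAL_VALUE_CHARS = /[ <>{}|^`\\]/;

/** Hypothetical helper: true when no deny-listed character appears. */
function isValidLocalValueSketch(localValue: string): boolean {
  return !INVALID_LOCAL_VALUE_CHARS.test(localValue);
}
```

With a deny-list, everything not explicitly rejected passes, which covers underscores, hyphens, periods, percent-encoded sequences, Unicode letters, and middle dots without enumerating them.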
Codecov Report: ❌ Patch coverage

Coverage Diff (main vs. #1539):

| Metric | main | #1539 | +/- |
| --- | --- | --- | --- |
| Coverage | 47.81% | 64.90% | +17.09% |
| Files | 382 | 369 | -13 |
| Lines | 8525 | 8327 | -198 |
| Branches | 3159 | 3101 | -58 |
| Hits | 4076 | 5405 | +1329 |
| Misses | 3070 | 2075 | -995 |
| Partials | 1379 | 847 | -532 |
    /** Abbreviates a segment to the first letter of each word. */
    function abbreviate(segment: string): string {
      const words = splitSegmentIntoWords(segment);
I think this entire function can be summarized as:

    return sanitizePrefix(
      splitSegmentIntoWords(segment)
        .map(w => w.charAt(0))
        .join("")
    )

IMHO, it's easier to reason about when written this way, because the two returns don't look semantically equivalent. It even makes me wonder: do we skip sanitizePrefix when there is more than one word?
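For reference, the one-expression form suggested above can be made runnable with stand-in helpers. `splitSegmentIntoWordsSketch` and `sanitizePrefixSketch` are hypothetical: the real `splitSegmentIntoWords` and `sanitizePrefix` are not shown in this thread, so the camelCase splitting and the lowercase alphanumeric sanitizing below are assumptions.

```typescript
/** Assumed behavior: split on camelCase boundaries and on non-alphanumeric characters. */
function splitSegmentIntoWordsSketch(segment: string): string[] {
  return segment
    .replace(/([a-z0-9])([A-Z])/g, "$1 $2") // "datatypeProperty" -> "datatype Property"
    .split(/[^A-Za-z0-9]+/)
    .filter((word) => word.length > 0);
}

/** Assumed behavior: lowercase and drop anything outside a-z and 0-9. */
function sanitizePrefixSketch(value: string): string {
  return value.toLowerCase().replace(/[^a-z0-9]/g, "");
}

/** Single code path: every result goes through the sanitizer, one word or many. */
function abbreviateSketch(segment: string): string {
  return sanitizePrefixSketch(
    splitSegmentIntoWordsSketch(segment)
      .map((w) => w.charAt(0))
      .join(""),
  );
}
```

Under these assumptions, `abbreviateSketch("datatypeProperty")` yields `dp`, and there is no branch where sanitizing could be skipped.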
@arseny-kostenko I'm running through some algorithm changes. Apparently prefixes are allowed to have hyphens, periods, and underscores, which changes how I want to approach prefix generation, so I'm going through the changes now. Sorry for the whiplash.
Description
- Add splitIri, generatePrefix, and commonPrefixes utility modules
- Add PrefixLookup class for fast Map-based prefix lookups
- Rewrite generatePrefixes to use splitIri and generatePrefix
- Rewrite replacePrefixes to use PrefixLookup and splitIri
- Change usePrefixes/prefixesAtom to return PrefixLookup
- Remove __matches from PrefixTypeConfig
- Simplify saveConfigurationToFile and useImportConnectionFile
- Prefix uniqueness (soccer, soccer2, soccer3)
- Add backward compatibility tests for legacy __matches data

Validation

- splitIri, generatePrefix, PrefixLookup, commonPrefixes, prefix uniqueness, entity ID prefix generation, local value validation, and backward compatibility with legacy data

Prefix generation examples:

| IRI | Prefix | Note |
| --- | --- | --- |
| http://www.example.com/soccer/ontology/League | soccer | |
| http://www.schema.org/City | schema | |
| http://data.nobelprize.org/resource/country/France | country | resource segment |
| https://dbpedia.org/class/yago/Record106647206 | yago | class segment |
| http://example.org/2024/01/schema#Thing | schema | |
| http://kelvinlawrence.net/air-routes/datatypeProperty/name | airdp | |
| http://kelvinlawrence.net/air-routes/objectProperty/route | airop | |
| http://example.org/my-special_ns.v2/Item | my | |
| http://www.example.com/soccer/ontology/League | soccer | |
| http://www.example.com/soccer/resource#EPL | soccer2 | |
| http://www.example.com/soccer/class#Team | soccer3 | |

Local value validation (accepted):

my_item, my-item, v2.0, caf%C3%A9, café, item·1

Local value validation (rejected → no prefix generated):

my item, a<b, a{b}, a|b, a^b, a`b, a\b

Related Issues
Check List

- license.
- pnpm checks to ensure code compiles and meets standards.
- pnpm test to check if all tests are passing.
- Changelog.md.