What is the best regex for a mix of capitalized words and symbols?

Stack Overflow user
Asked on 2022-04-25 06:36:10
Answers: 2 · Views: 46 · Followers: 0 · Score: 2

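I am cleaning up the following list: a list of game genres.

I want to split the words that were joined together, but the split does not handle all-caps acronyms and mixed-case terms correctly (for example PVP, MMORPG, MOBA, DeFi).

Currently, my regex code is as follows:

[re.sub(r"(\w)([A-Z])", r"\1 \2", ele) for ele in genre_list]

As you can see below, it sometimes works and sometimes does not: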

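['Collectible Open-World Virtual-World', 'Breeding Card PV P', 'Auto-Battler Breeding Strategy', 'Minigame Open-World Virtual-World', 'Action Simulation Sports', 'Adventure MM OStrategy', 'Adventure Casual Puzzle', 'Sports', 'Collectible Sci-Fi Virtual-World', 'Battle-Royalee Sports MO BA', 'Action PV PShooter', 'P VP Sci-Fi Tower-Defense', 'Action Battle-Royale', 'P VP Sci-Fi Shooter', 'Breeding Collectible Mining', 'Collectible De Fie Sports', 'Action Adventure Shooter', 'City-Building Collectible Simulation', 'Action Strategy', 'Adventure Open-World', 'Breeding Racing Sports', 'Open-World Virtual-World', 'Collectible Idle', 'Action Adventure', 'Card Collectible PV P', 'Battle-Royale Fantasy MO BA', 'City-Building', 'Building MM OStrategy', 'Adventure MM OR PG', 'Action Adventure Idle', 'M OB AR PG Strategy', 'M MO RP GStrategy', 'Card Collectible Idle', 'Open-World PV PR PG', 'De Fi MM OSpace', 'Collectible', 'Card Collectible PV P', 'Auto-Battler De Fi RP G', 'Adventure MM OOpen-World', 'Collectible Open-World Virtual-World', 'Collectible Idle RP G', 'Card Collectible PV P', 'Action Adventure PV P', 'Sci-Fi Shooter Survival', 'Action Strategy', 'Arcade Minigame', 'Breeding PV PRacing', 'M OB AP VP', 'Action Sports', 'P VP Space Turn-based', 'M MO Strategy Tower-Defense']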

Can you help me figure out which regex works best here? Or will regex not work for this list at all? Thanks!

2 Answers

Stack Overflow user

Accepted answer

Answered on 2022-04-26 13:34:05

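This is difficult because you have ALLCAPS words that may be glued to each other or to the neighboring words. If you have a list of those ALLCAPS words, the problem is solvable.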
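To make the difficulty concrete, here is a minimal sketch that applies the pattern from the question to a few glued strings. The helper name `naive_split` and the sample inputs are chosen for illustration only, and the pattern is assumed to be `(\w)([A-Z])`, which is what the second answer's reference to `[A-Z]` suggests.

```python
import re

# Assumed form of the pattern from the question: any word character
# followed by a single uppercase letter.
naive = re.compile(r"(\w)([A-Z])")

def naive_split(text):
    """Hypothetical helper: put a space before each matched uppercase letter."""
    return naive.sub(r"\1 \2", text)

for sample in ["BreedingCardPVP", "AdventureMMOStrategy", "MOBARPGStrategy"]:
    print(sample, "=>", naive_split(sample))

# BreedingCardPVP => Breeding Card PV P
# AdventureMMOStrategy => Adventure MM OStrategy
# MOBARPGStrategy => M OB AR PG Strategy
```

Because each match consumes two characters, a run of capitals gets cut at every second letter, and letter case alone cannot tell where one acronym ends and the next begins.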

Here is code that you can use and extend to get more accurate output:

import re
l = ['Collectible Open-World Virtual-World', 'Breeding Card PV P', 'Auto-Battler Breeding Strategy', 'Minigame Open-World Virtual-World', 'Action Simulation Sports', 'Adventure MM OStrategy', 'Adventure Casual Puzzle', 'Sports', 'Collectible Sci-Fi Virtual-World', 'Battle-Royalee Sports MO BA', 'Action PV PShooter', 'P VP Sci-Fi Tower-Defense', 'Action Battle-Royale', 'P VP Sci-Fi Shooter', 'Breeding Collectible Mining', 'Collectible De Fie Sports', 'Action Adventure Shooter', 'City-Building Collectible Simulation', 'Action Strategy', 'Adventure Open-World', 'Breeding Racing Sports', 'Open-World Virtual-World', 'Collectible Idle', 'Action Adventure', 'Card Collectible PV P', 'Battle-Royale Fantasy MO BA', 'City-Building', 'Building MM OStrategy', 'Adventure MM OR PG', 'Action Adventure Idle', 'M OB AR PG Strategy', 'M MO RP GStrategy', 'Card Collectible Idle', 'Open-World PV PR PG', 'De Fi MM OSpace', 'Collectible', 'Card Collectible PV P', 'Auto-Battler De Fi RP G', 'Adventure MM OOpen-World', 'Collectible Open-World Virtual-World', 'Collectible Idle RP G', 'Card Collectible PV P', 'Action Adventure PV P', 'Sci-Fi Shooter Survival', 'Action Strategy', 'Arcade Minigame', 'Breeding PV PRacing', 'M OB AP VP', 'Action Sports', 'P VP Space Turn-based', 'M MO Strategy Tower-Defense']
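# Remove all whitespace so every entry goes back to its glued form
l = [''.join(s.split()) for s in l]
# Known ALLCAPS acronyms; extend this list to improve accuracy
allcaps = ['RPG', 'MOBA', 'PVP', 'MMO']
# A lowercase letter before an uppercase one, or an uppercase letter
# that is followed by an uppercase letter and then a lowercase letter
rx_1 = re.compile(r'[a-z](?=[A-Z])|[A-Z](?=[A-Z][a-z])')
# A known acronym at the start of a word, directly followed by a letter
rx_2 = re.compile( fr"\b(?:{r'|'.join(allcaps)})(?=[A-Za-z])" )
# A known acronym at the end of a word, directly preceded by a letter
rx_3 = re.compile( fr"(?<=[A-Za-z])(?:{r'|'.join(allcaps)})\b" )
for s in l:
    print( r'{} => {}'.format(s, rx_3.sub(r" \g<0>", rx_2.sub(r"\g<0> ", rx_1.sub(r"\g<0> ", s)))) )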

See the Python demo. Output:

CollectibleOpen-WorldVirtual-World => Collectible Open-World Virtual-World
BreedingCardPVP => Breeding Card PVP
Auto-BattlerBreedingStrategy => Auto-Battler Breeding Strategy
MinigameOpen-WorldVirtual-World => Minigame Open-World Virtual-World
ActionSimulationSports => Action Simulation Sports
AdventureMMOStrategy => Adventure MMO Strategy
AdventureCasualPuzzle => Adventure Casual Puzzle
Sports => Sports
CollectibleSci-FiVirtual-World => Collectible Sci-Fi Virtual-World
Battle-RoyaleeSportsMOBA => Battle-Royalee Sports MOBA
ActionPVPShooter => Action PVP Shooter
PVPSci-FiTower-Defense => PVP Sci-Fi Tower-Defense
ActionBattle-Royale => Action Battle-Royale
PVPSci-FiShooter => PVP Sci-Fi Shooter
BreedingCollectibleMining => Breeding Collectible Mining
CollectibleDeFieSports => Collectible De Fie Sports
ActionAdventureShooter => Action Adventure Shooter
City-BuildingCollectibleSimulation => City-Building Collectible Simulation
ActionStrategy => Action Strategy
AdventureOpen-World => Adventure Open-World
BreedingRacingSports => Breeding Racing Sports
Open-WorldVirtual-World => Open-World Virtual-World
CollectibleIdle => Collectible Idle
ActionAdventure => Action Adventure
CardCollectiblePVP => Card Collectible PVP
Battle-RoyaleFantasyMOBA => Battle-Royale Fantasy MOBA
City-Building => City-Building
BuildingMMOStrategy => Building MMO Strategy
AdventureMMORPG => Adventure MMO RPG
ActionAdventureIdle => Action Adventure Idle
MOBARPGStrategy => MOBA RPG Strategy
MMORPGStrategy => MMO RPG Strategy
CardCollectibleIdle => Card Collectible Idle
Open-WorldPVPRPG => Open-World PVP RPG
DeFiMMOSpace => De Fi MMO Space
Collectible => Collectible
CardCollectiblePVP => Card Collectible PVP
Auto-BattlerDeFiRPG => Auto-Battler De Fi RPG
AdventureMMOOpen-World => Adventure MMO Open-World
CollectibleOpen-WorldVirtual-World => Collectible Open-World Virtual-World
CollectibleIdleRPG => Collectible Idle RPG
CardCollectiblePVP => Card Collectible PVP
ActionAdventurePVP => Action Adventure PVP
Sci-FiShooterSurvival => Sci-Fi Shooter Survival
ActionStrategy => Action Strategy
ArcadeMinigame => Arcade Minigame
BreedingPVPRacing => Breeding PVP Racing
MOBAPVP => MOBA PVP
ActionSports => Action Sports
PVPSpaceTurn-based => PVP Space Turn-based
MMOStrategyTower-Defense => MMO Strategy Tower-Defense

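The [a-z](?=[A-Z])|[A-Z](?=[A-Z][a-z]) regex (see its demo) matches

  • [a-z](?=[A-Z]) - a lowercase letter that is immediately followed by an uppercase letter
  • | - or
  • [A-Z](?=[A-Z][a-z]) - an uppercase letter that is followed by an uppercase letter and then a lowercase letter.

We add a space after each of these matches.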
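The effect of this first pattern on its own can be sketched as follows. The pattern is the same as rx_1; the helper name `split_on_case` and the sample strings are illustrative choices, not part of the original answer.

```python
import re

# Same pattern as rx_1 in the answer.
case_boundary = re.compile(r'[a-z](?=[A-Z])|[A-Z](?=[A-Z][a-z])')

def split_on_case(text):
    """Hypothetical helper: add a space after every case-boundary match."""
    return case_boundary.sub(r"\g<0> ", text)

for sample in ["ActionSimulationSports", "BreedingPVPRacing", "AdventureMMORPG"]:
    print(sample, "=>", split_on_case(sample))

# ActionSimulationSports => Action Simulation Sports
# BreedingPVPRacing => Breeding PVP Racing
# AdventureMMORPG => Adventure MMORPG
```

The last sample shows why a second pass is needed: two acronyms glued to each other contain no lowercase letter, so the case-based pattern finds no boundary between them.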

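The rx_2 and rx_3 regexes are built from the list of ALLCAPS words. They add a space to the right or to the left of an acronym, depending on which side another letter is attached.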
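The acronym pass can be sketched in isolation like this. The function names and the three-acronym sample are hypothetical, and wrapping each acronym in `re.escape` is an added precaution that the original code does not use.

```python
import re

ACRONYMS = ['RPG', 'MOBA', 'PVP', 'MMO']

def build_acronym_patterns(acronyms):
    """Hypothetical helper: build the two patterns described above."""
    # re.escape is a precaution (not in the original answer) in case an
    # acronym ever contains a regex metacharacter.
    alt = "|".join(re.escape(a) for a in acronyms)
    # Acronym at a word start with a letter glued to its right (like rx_2).
    glued_right = re.compile(rf"\b(?:{alt})(?=[A-Za-z])")
    # Acronym at a word end with a letter glued to its left (like rx_3).
    glued_left = re.compile(rf"(?<=[A-Za-z])(?:{alt})\b")
    return glued_right, glued_left

def split_acronyms(text, acronyms):
    glued_right, glued_left = build_acronym_patterns(acronyms)
    text = glued_right.sub(r"\g<0> ", text)   # space after the acronym
    return glued_left.sub(r" \g<0>", text)    # space before the acronym

for sample in ["MMORPG", "MOBAPVP", "MMORPGPVP"]:
    print(sample, "=>", split_acronyms(sample, ACRONYMS))

# MMORPG => MMO RPG
# MOBAPVP => MOBA PVP
# MMORPGPVP => MMO RPG PVP
```

In the third sample the first pattern separates the leading MMO and the second pattern separates the trailing PVP, which is why both directions are needed.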

Score: 0

Stack Overflow user

Answered on 2022-04-25 06:48:19

Edit based on the comments: you just need to add a '+' after [A-Z], i.e. r"(\w)([A-Z]+)". This matches one or more uppercase letters.
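A quick way to see how this variant behaves is to run it on a few glued strings. This is only a sketch: the helper name `split_with_plus` and the sample inputs are hypothetical, and the replacement string `\1 \2` is assumed to be the same one used in the question.

```python
import re

# Pattern suggested in this answer: a word character followed by
# one or more uppercase letters.
plus_pattern = re.compile(r"(\w)([A-Z]+)")

def split_with_plus(text):
    """Hypothetical helper: put a space before each run of uppercase letters."""
    return plus_pattern.sub(r"\1 \2", text)

for sample in ["BreedingCardPVP", "AdventureMMOStrategy", "PVPSci-FiTower-Defense"]:
    print(sample, "=>", split_with_plus(sample))

# BreedingCardPVP => Breeding Card PVP
# AdventureMMOStrategy => Adventure MMOStrategy
# PVPSci-FiTower-Defense => P VPSci-Fi Tower-Defense
```

In this sketch an acronym at the end of a string comes out intact, while an acronym glued to a following capitalized word, or one sitting at the very start of the string, is still not separated cleanly.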

Score: 0
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/71995355
