View Full Version : Language filter - list of foul language?


Indiepath
07-22-2005, 03:20 AM
Does anyone have a list of bad-language words? Something that could be used to censor obscenities from received data. (For example, if a player name is "I hate (f-word) Bush!?!" it could be censored...)

Thanks in advance.

P.S. I don't want you to tell me all the bad language you know. I'd rather not start a thread where people start throwing obscenities around... Just a ready-made list please :)

mahlzeit
07-22-2005, 03:50 AM
I know George Carlin maintains an incomplete list of impolite words. :)

PeterM
07-22-2005, 05:38 AM
I'd say you'd need to support wildcards too, so that "any symbol" can match "any letter".

A lot of people like using ! for i, @ for a, and so on, so you might want to have a generic way of dealing with that.
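A minimal sketch of that idea in Python, assuming a small hand-picked substitution table. LEET_MAP, normalize and contains_blocked are made-up names for illustration, and "heck" stands in for a real blocked word:

```python
# Hypothetical substitution table: look-alike symbols mapped back to letters.
LEET_MAP = str.maketrans({
    "!": "i",
    "1": "i",
    "|": "i",
    "@": "a",
    "4": "a",
    "$": "s",
    "5": "s",
    "0": "o",
    "3": "e",
})

def normalize(text: str) -> str:
    """Lowercase the text and undo common symbol-for-letter substitutions."""
    return text.lower().translate(LEET_MAP)

def contains_blocked(text: str, blocked: set) -> bool:
    """Return True if any whitespace-separated word, once normalized, is blocked."""
    return any(word in blocked for word in normalize(text).split())
```

With this, contains_blocked("what the h3ck", {"heck"}) is true. One catch: ordinary punctuation gets translated too, so "heck!" normalizes to "hecki" and slips through. A real filter would have to deal with trailing punctuation before normalizing.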

Sharkbait
07-22-2005, 06:06 AM
You might want to be careful with wildcards, you don't want to end up censoring 'shot' and 'funk' just because they get filtered out by 'sh?t' and 'fu?k'.

Also, country differences can be an issue. A perfectly acceptable English word in one region/country may have offensive connotations in another.
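To make the wildcard problem concrete, here is a small Python sketch using the mild stand-in word "heck" (the patterns and the flagged helper are made up for illustration). A bare wildcard accepts any character, while a character class only accepts the look-alikes you list:

```python
import re

# Hypothetical mild target word: "heck".
wildcard = re.compile(r"h.ck", re.IGNORECASE)   # "." accepts any character
narrow = re.compile(r"h[e3]ck", re.IGNORECASE)  # only "e" or its look-alike "3"

def flagged(pattern: re.Pattern, words: list) -> list:
    """Return the words in which `pattern` finds a match."""
    return [w for w in words if pattern.search(w)]

words = ["heck", "h3ck", "hack", "hock"]
print(flagged(wildcard, words))  # flags all four, including the innocent ones
print(flagged(narrow, words))    # flags only "heck" and "h3ck"
```

The narrower class misses substitutions you didn't think of, so it is a trade between false positives and false negatives.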

papillon
07-22-2005, 06:13 AM
If you're worried about public high score lists there's always going to be a chance for someone to get around you - I once had a character called "Blowsgoats" in an RPG because I was annoyed at its swear filter. So you may still need to keep an eye on things, people can arrange legitimate words in stupid ways... :)

Anthony Flack
07-22-2005, 06:40 AM
You might want to be careful with wildcards, you don't want to end up censoring 'shot' and 'funk' just because they get filtered out by 'sh?t' and 'fu?k'.


You could implement a more intelligent one... like "sh[i/!]t". A really clever system might be able to catch stuff like F.U.C.K. as well. And an extraordinarily clever one might be able to allow certain exceptions like "Scunthorpe".

Actually, if someone were to make a dll or similar that can parse and censor strings in an intelligent and thorough manner, it might be quite a useful thing for everybody.
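A rough Python sketch of what such a parser might look like, assuming separators are limited to a few punctuation characters and exceptions are kept in a plain allow-list. Every name here is hypothetical, and "heck" again stands in for a real blocked word (it hides inside "checker", which plays the role of "Scunthorpe"):

```python
import re

SEPARATORS = r"[.\-_*]*"  # assumed set of characters people put between letters

def spaced_pattern(word: str) -> re.Pattern:
    """Build a regex matching `word` with optional separators between letters."""
    body = SEPARATORS.join(re.escape(ch) for ch in word)
    return re.compile(body, re.IGNORECASE)

BLOCKED = [spaced_pattern("heck")]   # hypothetical blocklist
ALLOWED = {"checker", "checkers"}    # hypothetical exceptions

def censor_token(match: re.Match) -> str:
    """Mask blocked substrings in one token unless the token is allow-listed."""
    token = match.group()
    if token.lower().strip(".,!?") in ALLOWED:
        return token
    for pattern in BLOCKED:
        token = pattern.sub(lambda m: "*" * len(m.group()), token)
    return token

def censor(text: str) -> str:
    """Run censor_token over every whitespace-delimited token, keeping spacing."""
    return re.sub(r"\S+", censor_token, text)
```

Here censor("H.E.C.K. no") gives "*******. no", while censor("play checker") comes back unchanged. Letters separated by spaces are not caught, and the allow-list has to be maintained by hand, so this is a starting point rather than a thorough solution.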

mahlzeit
07-22-2005, 06:59 AM
Of course, you could google for "swear filter source code". It's not like you're the first person ever to write something like this. :)

chanon
07-22-2005, 08:53 AM
I'd also like a list of swear words. I haven't been able to find one through google. Lots of people saying they want a list of swear words but no list file!

mahlzeit
07-22-2005, 09:03 AM
Here's the George Carlin list (http://www.textfiles.com/drugs/bbrosn13.txt) (and 2 (http://www.textfiles.com/drugs/bbrosn14.txt)) I mentioned earlier. You can also find it in audio if you search a little. Have fun!

Raptisoft
07-22-2005, 09:23 AM
"Goober Grabber?"

soniCron
07-22-2005, 12:33 PM
Here's the George Carlin list (http://www.textfiles.com/drugs/bbrosn13.txt) (and 2 (http://www.textfiles.com/drugs/bbrosn14.txt)) I mentioned earlier. You can also find it in audio if you search a little. Have fun!

Hah! :D You weren't kidding!

chanon
07-23-2005, 02:30 AM
Here's the George Carlin list (http://www.textfiles.com/drugs/bbrosn13.txt) (and 2 (http://www.textfiles.com/drugs/bbrosn14.txt)) I mentioned earlier. You can also find it in audio if you search a little. Have fun!

Even though it looks like I'll still have to go through it by hand... Thanks!

DavidRM
07-23-2005, 08:49 PM
Here's a list of simple regular expressions that catch quite a few of the more common vulgarities.


f[uv+*_!-]ck
f.[uv+*_!-]ck
f[uv+*_!-][kq]
f.[uv+*_!-].c.k
ph[uv+*_!-]ck
ph[uv+*_!-][kq]
[s$5]h[il1|+*_!-]t
[s$5].h.[il1|+*_!-].t
d[a@4+*_!-]mn
b[il1|+*_!-]tch
b.[il1|+*_!-].t.c.h
b[a@4][s$5]t[a@4]rd
b.[a@4].[s$5].t.[a@4].r.d
c[uv+*_!-]nt
c[li1|][il1|+*_!-]t
cornh[o0][li1|]
b[uv+*_!-]ngh[o0][li1|]
[s$5][li1|]ut
v[a@4]gin[a@4]
p[e3]n[il1|][s$5]
wh[o0]re
[a@4][s$5][s$5]h[o0][li1|]
[a@4][s$5][s$5]w[il1|]p
d[uv+*_!-]mb[a@4+*_!-][s$5][s$5]
@[s$5][s$5]
j[e3][s$5][uv+*_!-][s$5] chri[s$5]t
j[e3][s$5][uv+*_!-][s$5]
b[li1|][o0]wj[o0]b
n[il1|+*_!-]gg[e3]r
n[il1|+*_!-]gg[s$5]


Hope that helps.

-David
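For anyone wanting to plug patterns like these into code, here is one possible way to do it with Python's re module. This is only a sketch: the two patterns are mild stand-ins written in the same style, and RAW_PATTERNS, FILTER and censor are names invented for the example. One thing worth knowing: in most regex flavors a hyphen between two characters inside a class is read as a range, so `*-_` covers everything from `*` to `_` in ASCII order (digits and uppercase letters included). Putting the hyphen last keeps it literal.

```python
import re

# Hypothetical, deliberately mild patterns in the same style as the list.
# The hyphen sits last in each class so it is a literal, not a range.
RAW_PATTERNS = [
    r"d[a@4+*_!-]rn",
    r"h[e3+*_!-]ck",
]

# Combine everything into one case-insensitive alternation.
FILTER = re.compile("|".join(f"(?:{p})" for p in RAW_PATTERNS), re.IGNORECASE)

def censor(text: str, mask: str = "*") -> str:
    """Replace every match with a run of `mask` of the same length."""
    return FILTER.sub(lambda m: mask * len(m.group()), text)

# The range pitfall: "A" lies between "*" and "_", so the first class accepts it.
range_hit = re.fullmatch(r"[+*-_!]", "A") is not None
literal_hit = re.fullmatch(r"[+*_!-]", "A") is not None
```

censor("Well d@rn, what the H3CK") returns "Well ****, what the ****". Since these patterns match anywhere in the string, they will also hit inside longer words, so some kind of exception handling is still needed on top.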

soniCron
07-24-2005, 08:04 AM
Here's a list of simple regular expressions that catch quite a few of the more common vulgarities...

That's excellent! I've really got to learn regular expressions. I've been putting it off for so long now... ;)

cliffski
07-24-2005, 08:23 AM
Also, country differences can be an issue. A perfectly acceptable English word in one region/country may have offensive connotations in another.

You betcha. Us Brits always do a double take when an American describes someone as 'full of spunk' for example.

Jim Buck
07-24-2005, 10:14 AM
Many of us Americans double-take on that phrase too. :)

Applewood
07-26-2005, 05:05 AM
I was once thrown off Westwood Online for an entire week for the heinous crime of writing "snigger" in response to a smutty comment.

When I mailed them to complain and offered to spend a weekend writing them a competent parser, their response was even more bizarre: that I should stop spreading further racial hatred before I got banned for life.

Bunch of 4$$h0l3z

digriz
07-26-2005, 05:46 AM
I wonder if ThinkGeek will accept t-shirt ideas. Geeky swearing... Fantastic.

I think Cas & I have discussed the use of a book called Roger's Profanisaurus (http://www.amazon.com/exec/obidos/tg/detail/-/0752215078/qid=1122381849/sr=8-1/ref=pd_bbs_1/002-8645007-3360034?v=glance&s=books&n=507846)... It might be of interest to you. It's also very, very funny.

Indiepath
07-26-2005, 08:22 AM
Hmm... thanks, people. Regular expressions could handle some of it... and then a custom word list for the rest... and I will definitely buy that book ;)

Indiepath.T
07-26-2005, 08:57 AM
Get coding that system then so I can release this software :D