Wednesday March 18th, 2015 Terry Riegel
middleUTF()
UTF and HTML/OS
I decided to roll my own UTF-8 string compatibility.
So if you have a string of text that you know is UTF-8 encoded, you can get its length in characters and parse it just like regular text.
Step 1. | Step 2. | Step 3. |
---|---|---|
Prepare: temp=asUTF(text) | Perform string manipulations using the alternative tags length_(temp), middle_(temp,s,e), etc. (notice the underscore) | Convert back to a string: text=asText(temp) |
For example, this page...
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>UTF8 Demo</title>
</head>
<body>
<<
msg="Everybody is using emoticons (😊😋😨😆) These days"
display "msg="+msg+'<br>' /display
display "Length(msg)="+length(msg)+'<br>' /display
msg2=asUTF(msg)
display "Length_(msg2)="+length_(msg2)+'<br>' /display
msg_middle=middle_(msg2,31,31)
display '31st character is "'+asText(msg_middle)+'"' /display
>>
</body>
</html>
...would display this output:
msg=Everybody is using emoticons (😊😋😨😆) These days
Length(msg)=58
Length_(msg2)=46
31st character is "😊"
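The two lengths differ because length() counts bytes while length_() counts characters: each of the four emoticons takes four bytes in UTF-8, so the 42 plain characters plus 16 emoticon bytes give 58, against 46 characters. As a quick cross-check outside HTML/OS, here is a minimal Python sketch (Python is only a convenient stand-in here, not part of the original code) that arrives at the same numbers:

```python
msg = "Everybody is using emoticons (😊😋😨😆) These days"

# Bytes in the UTF-8 encoding: what a byte-oriented length() reports.
byte_length = len(msg.encode("utf-8"))

# Code points: what length_() reports after asUTF().
char_length = len(msg)

# Python indexes from 0, so index 30 is the 31st character.
char_31 = msg[30]

print(byte_length)  # 58
print(char_length)  # 46
print(char_31)      # 😊
```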
Here are the functions:
function asUTF(text) locals x,s,st,i,a do
x=1
st='ERROR'
while x<length(text)+1 do
i=getascii(middle(text,x,x))
if i<192 then s=1
elif i<224 then s=2
elif i<240 then s=3
elif i<248 then s=4
elif i<252 then s=5
elif i<254 then s=6
else s=1
/if
a='ERROR'
a=middle(text,x,x+(s-1))
if st='ERROR' then st=a else st=merge(st,a) /if
x=x+s
/while
return st /return
/function
function middle_(text_UTF,s,e) do
return gettable(text_UTF,s,e,1,1) /return
/function
function length_(text_UTF) do
return cols(text_UTF) /return
/function
function left_(text_UTF,i) do
return gettable(text_UTF,1,i,1,1) /return
/function
function right_(text_UTF,i) locals l do
l=cols(text_UTF)
return gettable(text_UTF,l-i+1,l,1,1) /return
/function
function reverse_(text_UTF) do
return reversecols(text_UTF) /return
/function
function concat_(text1_UTF,text2_UTF) do
return merge(text1_UTF,text2_UTF) /return
/function
function asText(text_UTF) locals text,x do
text=''
for name=x value=1 to cols(text_UTF) do
text=text+text_UTF[x,1]
/for
return text /return
/function
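The core of asUTF is the rule that the first byte of each UTF-8 character tells you how many bytes that character occupies. The sketch below restates that rule in Python, purely as an illustration and not as a replacement for the HTML/OS code. The names sequence_length and split_utf8 are made up for this example, the thresholds are copied from asUTF, and the fallback of one byte for values 254 and 255 is an assumption added here so that the loop always advances.

```python
def sequence_length(lead_byte):
    """Return how many bytes a UTF-8 character occupies, from its first byte."""
    # Same thresholds as asUTF: 192, 224, 240, 248, 252, 254.
    if lead_byte < 192:
        return 1
    if lead_byte < 224:
        return 2
    if lead_byte < 240:
        return 3
    if lead_byte < 248:
        return 4
    if lead_byte < 252:
        return 5
    if lead_byte < 254:
        return 6
    # Assumption: bytes 254 and 255 never start a character, so step over one byte.
    return 1


def split_utf8(data):
    """Split UTF-8 bytes into a list with one entry per character."""
    chunks = []
    pos = 0
    while pos < len(data):
        size = sequence_length(data[pos])
        chunks.append(data[pos:pos + size])
        pos += size
    return chunks


chunks = split_utf8("a😊é".encode("utf-8"))
print([len(c) for c in chunks])  # [1, 4, 2]
```

Joining the chunks back together plays the role of asText: b"".join(chunks).decode("utf-8") returns the original string.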