AltME: Parse

Messages

Wednesday 5th March, 2014

PeterWood: 09:43A web-public Parse group.
GrahamC: 18:34And Peter's first message is redacted in the web mirror!
Bo: 18:38As a show of support, this is my first post to any Parse group, not just the web-public one. :-)
My favorite use of parse: parse some-str { }
Breaks apart any string into individual words.
sqlab: 20:29This looks like Rebol 3; Rebol 2 uses parse some-str none. In both cases, though, it breaks a string into strings, not into words.
Andreas: 20:44Works in both, R2 and R3.

20:45But for this particular use, Rebol 3 now has a _much_ better tool: SPLIT.
Bo: 20:47I learned something already. :-)

Thursday 6th March, 2014

sqlab: 08:47It works in R2, as even
>> parse "a b c" {,}
== ["a" "b" "c"]
breaks at whitespace.
Gabriele: 09:48Don't forget that PARSE str "," (or any other delimiter), in R2 at least, is meant to parse CSV lines, so it has some built-in magic that may surprise you if you're not aware of it.

09:48>> parse {a,b,c} ","
== ["a" "b" "c"]
>> parse {a,b,c d} ","
== ["a" "b" "c" "d"]
>> parse {a,b,"c d"} ","
== ["a" "b" "c d"]
>> parse {a,b,"c d,e"} ","
== ["a" "b" "c d,e"]

09:49note also:
>> parse/all {a,b,c d} ","
== ["a" "b" "c d"]
>> parse/all {a,b,c,d} ","
== ["a" "b" "c" "d"]
>> parse/all {a,b,"c,d"} ","
== ["a" "b" "c,d"]
sqlab: 09:55Just what I wanted to point out.

Friday 7th March, 2014

Geomol: 17:50Would it make sense to let
parse "abc" [3 char!]
be the same as
parse "abc" [3 skip]
Geomol: 18:01Maybe letting this be true is better:
parse "abc" [word!]
Like this is true:
parse "123" [integer!]

Sunday 9th March, 2014

DocKimbel: 21:44Your opinion is welcome: COMMENT in PARSE
https://github.com/red/red/issues/724
Andreas: 22:12Related CureCode issue for COMMENT in PARSE:
http://issue.cc/r3/1966
Gregg: 22:55Opinion posted.

Monday 28th April, 2014

Rondon: 21:11Hi folks, I'm having a problem parsing the '&' (ampersand) symbol. I'm using web-to-plain.r from rebol.org, but I'm having trouble parsing the 'Inc names. Any clues?

21:13My problem is that I have some HTML entities starting with &, but I need to find just the company names such as AT&T, A&E, Film&Arts and transform the lone '&' into '&amp;'.

Tuesday 29th April, 2014

NickA: 02:20Rondon, so, the company names are always sandwiched between two other characters -is that correct? Are the html entities always characterized by a different matching pattern (not sandwiched the same way)?
Arnold: 05:27We have a company/restaurant in Holland that is called Keuk& (translation Kitchen). Otoh the amp is not allowed in urls is it?
An idea might be to hardcode these few examples and transform them in an extra parse step, or just before returning the value from the db.

05:29(forget the remark about url and &).
Rondon: 12:25yes.. Nick

12:28I will have to scan all the "&" and compare them with HTML entities (&amp;, &acute;). If there are two words around the '&', I have to keep those words and replace '&' with "&amp;".
Rondon: 12:33I was trying to make a patch to web-to-plain.r from rebol.org

Tuesday 10th June, 2014

Tomc: 23:09@Rondon I would love to see what you come up with. web-to-plain.r changes web-encoded chars to ASCII plaintext, so emitting an "&amp" in place of an "&" in the input would be counter to its nature. Maybe describe the problem a bit more: what the input is and what needs to be changed in the output. Just looking at it now (after a decade), I think I would at least change the call from parse to parse/all.

Tuesday 4th August, 2015

Endo: 10:45How do I use NOT in PARSE in R2?
;On R3
>> parse "a" [not "b" skip]
== true
On R2?
Pekr: 10:59Is 'not available in terms of R2 parse at all?
Endo: 11:02Better to ask: how do I do something like this on R2:
PARSE/all "abc" ["a" not "x" "c"] ;==true
@Pekr: No, unfortunately not.

11:02>> parse "x" [not "a"]
** Script Error: Invalid argument: ?native?
Endo: 11:13Normally I won't compare with one char, so using a complemented charset is not useful for me.
not-a: complement charset "a"
parse/all "x" [not-a] ; == true
Geomol: 11:13Won't this work?
>> not-x: complement charset "x"
>> parse "abc" ["a" not-x "c"]
== true
>> parse "axc" ["a" not-x "c"]
== false
Pekr: 11:14Geomol just beat me to that :-)
Endo: 11:14:-) I beat you both :P
Geomol: 11:16Why is complemented charset not useful?
Endo: 11:16Because I need to NOT a word.

11:17Something like:
>> parse/all "this" [some ["this" (print "ok" halt) | skip] ]
ok
>> parse/all "this" [some ["that" (print "ok" halt) | skip] ]
== true

11:17But I don't know how to stop PARSE in the first example.

11:17Instead of HALTing.
Geomol: 11:40to end 1 skip

11:40Which will always return false, I think.

11:41or just: to end skip
Endo: 11:48Sorry, I think I confused things. How do I write an "except this one"-like rule?
>> not-four: ["four" to end skip]
== ["four" to end skip]
>> parse/all "one two three" ["one " not-four " three"]
== false
Geomol: 11:51>> parse/all "one two three" ["one " [not-four | to " "] " three"]
== true
Or use block parsing, if that's an option.

11:51argh :)
Doesn't work.

11:55I guess, you need to include all the ok possibilities?
Endo: 11:55Block parsing could be useful but I'm parsing huge SQL files which are not LOADable (and different formats from each other).
Switching to R3 is easier.

11:56I think I need to play with index positions during the parse.

11:56Thank you for your time Geomol.
Geomol: 11:59welcome
Endo: 13:17I'll go with a regular expression: ^((?!four).)*$
It matches the lines that do not contain "four".
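For anyone who wants to try the negative-lookahead pattern above outside Rebol, here is a minimal sketch in Python. The function name and the sample lines are made up for illustration; only the pattern itself comes from the message above.

```python
import re

# At each position, assert that "four" does not start here, then consume
# one character. The ^...$ anchors force the whole line to qualify.
NO_FOUR = re.compile(r"^((?!four).)*$")

def lines_without_four(text):
    """Return the lines of text that do not contain "four"."""
    return [line for line in text.splitlines() if NO_FOUR.match(line)]

# Hypothetical sample input, one candidate per line.
sample = "one two three\none four three\none five three"
print(lines_without_four(sample))  # ['one two three', 'one five three']
```

The lookahead is re-checked before every character, so a line is rejected as soon as "four" appears anywhere in it.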

Wednesday 5th August, 2015

sqlab: 07:22not-four: [[(not-four/2: [])  "four" (not-four/2:  [thru end skip] ) | to " "]  []]
>> parse/all "one two three" ["one " not-four  " three"  ]
== true
>> parse/all "one four three" ["one " not-four  " three"  ]
== false
Gabriele: 08:49A variation of sqlab's approach:
>> space: [some #" "]
== [some #" "]
>> parse/all "one two three" [(fail?: none) "one" space ["four" space (fail?: [end skip]) | to #" " space] fail? "three"]
== true
>> parse/all "one four three" [(fail?: none) "one" space ["four" space (fail?: [end skip]) | to #" " space] fail? "three"]
== false
>> parse/all "one five three" [(fail?: none) "one" space ["four" space (fail?: [end skip]) | to #" " space] fail? "three"]
== true
Gabriele: 09:04If your case is more specific maybe it can be done in a different way, like checking for the condition after parsing, or filtering out the input you don't want in advance, and so on.

09:07In Topaz you could do something like:
>> parse [one two three] ['one either 'four [(false)] [skip 'three (true)]]
== true
>> parse [one four three] ['one either 'four [(false)] [skip 'three (true)]]
== false
(no string parsing yet so I used a block to illustrate)

Friday 7th August, 2015

Ladislav: 12:31Endo, you can try my parse enhancements at
http://www.rebol.org/view-script.r?script=parseen.r&sid=ypnz89xc

Friday 5th February, 2016

Arnold: 20:06I have an SQL text with arguments in the form ":argument_1". How do I get a list of the arguments used in this SQL using parse?

Saturday 6th February, 2016

Endo: 12:58Something like this?
>> digit: charset [#"0" - #"9"]
>> alpha: charset [#"a" - #"z" #"A" - #"Z"]
>> alphanum: union alpha digit
>> validchars: union alphanum charset [#"_"] ;put any other valid chars here
>> sql: {select * from table where a = :param1 and b=:param2     or x=3}
>> parse/all sql [some [thru ":" copy p some validchars (print p) | skip]]
param1
param2

12:59You may need to clean up comments first. You can use something like:
remove-sql-comments: has [m n] [
    parse/all read clipboard:// [
        some [
            m: "--" to newline n: (remove/part m n) :m
        |
            m: "/*" thru "*/" n: (remove/part m n) :m
        |
            skip
        ]
    ]
]
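As a cross-check of the two steps above (strip comments, then collect the ":name" arguments), here is a rough Python sketch. It is not a translation of the parse rules: the function name is hypothetical, block comments are removed before line comments rather than in input order, and, like the Rebol version, it ignores comment markers and colons that appear inside string literals.

```python
import re

def sql_parameters(sql):
    """Return the names of :param style arguments found outside comments."""
    # Drop /* ... */ block comments (non-greedy, may span lines).
    no_block = re.sub(r"/\*.*?\*/", "", sql, flags=re.DOTALL)
    # Drop -- comments up to the end of the line.
    no_comments = re.sub(r"--[^\n]*", "", no_block)
    # A parameter is a colon followed by letters, digits or underscores.
    return re.findall(r":([A-Za-z0-9_]+)", no_comments)

sql = "select * from table where a = :param1 and b=:param2     or x=3"
print(sql_parameters(sql))  # ['param1', 'param2']
```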

Monday 8th February, 2016

Arnold: 12:16Thank you Endo. This is very useful.
Endo: 15:33It is not complete, it doesn't care about comments in strings, but you got the idea.

Thursday 31st March, 2016

szeng: 14:59Can anybody help me replace every "on-init" that matches the pattern "space on-init non-word" with "abcd" in a string?

15:00I've tried

15:01space: charset [#" " #"^-"]
word: charset [#"a" - #"z" #"A" - #"Z" #"-"]
non-word: complement word
on-init-rule: [
    space mark: "on-init" non-word (
            remove/part mark 7 ;remove on-init
            insert mark "abcd"
            )
]
parse/all inp: {abcasdfasdf on-init
a on-init
b
} [
    any [
        thru on-init-rule
    ]
]

15:01it failed with:

15:02** Script error: PARSE - invalid rule or usage of rule: make bitset! #{0040000080}
** Where: parse do either either either -apply-
** Near: parse/all inp: {abcasdfasdf on-init
a on-init
b
} [
    any ...
>> q
DocKimbel: 15:12Here is a working version:
space: charset [#" " #"^-"]
word: charset [#"a" - #"z" #"A" - #"Z" #"-"]
non-word: complement word
on-init-rule: [
    space mark: "on-init" non-word (
        remove/part mark 7 ;remove on-init
        mark: insert mark "abcd"
    ) :mark
]
parse/all inp: {abcasdfasdf on-init
a on-init
b
} [some [on-init-rule | skip]]
szeng: 15:14Thanks Doc, I'll give it a try

15:15Yes, it works. Thanks!
DocKimbel: 15:18You're welcome.

Wednesday 9th November, 2016

Endo: 08:03parse/all #{010203} [thru #{03} (print ".")] ;works on R3 and Red but fails on R2, any workaround for this?
Arnold: 09:58There is no refinement all. Leave that out and the output is like the output for R3 with all refinement.
Rebolek: 10:01Arnold, there is /all refinement in R2.
sqlab: 10:39@Endo
parse/all to-string #{010203} compose [thru (to-string #{03}) ([(print ".")]) ]
Endo: 10:51So the only workaround for parsing binary! is converting to string!?
Although this one works, so it looks like parsing with binary! works but TO / THRU doesn't.
R2> parse/all #{010203} [#{010203}]
== true
Arnold: 12:51Sorry I only use a Red version from before the libRed changes. that is why I got the message
red>> parse/all #{010203} [thru #{03} (print ".")]
*** Script Error: parse has no refinement called all
*** Where: parse
Endo: 15:42Sure, there is no /all in Red, it is default. I meant the difference of TO with binary!.
DocKimbel: 15:56Using a char! or string! as matching target works on R2:
parse/all #{010203} [thru #"^(03)" (print ".")]
parse/all #{010203} [thru "^(03)" (print ".")]
Gabriele: 19:59You can also use AS-STRING instead of TO-STRING so that there is no conversion really going on.
>> bin: #{010203}
== #{010203}
>> str: as-string bin
== "^A^B^C"
>> append str "A"
== {^A^B^CA}
>> bin
== #{01020341}

Thursday 10th November, 2016

Endo: 06:44Thank you all, using char/string or as-string both look like good solutions.

Wednesday 25th April, 2018

GiuseppeC: 18:41I need to parse the following string "5.1+2+1" to split the single numeric values (REBOL2 here)
I use the following code:
data: "5.1+2+1"
parse data [some [copy percentuale some cs (print ["Percentuale" percentuale])  to   ["+"  | none] skip] to end]
and I get an error :
** Script Error: Invalid argument: + | none
** Near: parse data [some [copy percentuale some cs (print ["Percentuale" percentuale]) to ["+" | none] skip] to end]
While this works:
data: "5.1+2+1"
parse data [some [copy percentuale some cs (print ["Percentuale" percentuale])  to   "+"  | none skip] to end]
Percentuale 5.1
Percentuale 2
Percentuale 1
Seems <to [ "+" | none]> is not allowed but  <to   "+"  | none> is !
Also, if I use END in place of NONE:
data: "5.1+2+1"
parse data [some [copy percentuale some cs (print ["Percentuale" percentuale])  to   "+"  | END skip] ]
I get only:
Percentuale 5.1
Sunanda: 20:21Given the special, simple case of splitting a string on a single character, you could use this simpler form:
    data: "5.1+2+1"
    == "5.1+2+1"
    parse data "+"
    == ["5.1" "2" "1"]
GiuseppeC: 20:25"+" could be "+" or "-"
GiuseppeC: 20:32And later I need to store the sign

Thursday 26th April, 2018

Ladislav: 15:06Hi, Giuseppe. You correctly concluded that to ["+" | none] is not allowed in the Parse dialect at present. If it were allowed, it would still not be what you want, I guess. Don't you want something like to "+" | to end ?

15:09... or maybe to "+" | skip to end to eliminate the "already at end" case

15:10... if you want to eliminate it

15:13BTW, your some cs code is not understandable, unless you show how it is defined
GiuseppeC: 20:01cs: charset ["." #"0" - #"9"]
I forgot to write it !
GiuseppeC: 20:14My goal is to extract each %, build a block with them, and then calculate the overall value when applied to a number.
Example:
5.1%+2%+2% of 150
I need to calculate
5.1%*150=V1
2%*150=V2
2%*150=V3
Total: V1+V2+V3
The concatenated percentages are taken from a DB; they come without the percentage sign, as a STRING value.
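The calculation described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name is hypothetical, the input is assumed to contain plain decimal numbers joined by "+" or "-" (the sign case mentioned earlier in the thread), and Decimal is used so that 5.1% of 150 comes out as exactly 7.65 rather than a binary floating-point approximation.

```python
import re
from decimal import Decimal

def apply_percentages(spec, base):
    """Split a string such as "5.1+2-1" into signed percentages and
    apply each one to base. Returns (list of partial values, total)."""
    parts = []
    # Each match is an optional sign followed by a decimal number.
    for sign, number in re.findall(r"([+-]?)\s*(\d+(?:\.\d+)?)", spec):
        pct = Decimal(number)
        if sign == "-":
            pct = -pct
        # base is assumed to be an int or a numeric string, not a float.
        parts.append(pct * Decimal(base) / 100)
    return parts, sum(parts)

values, total = apply_percentages("5.1+2+2", 150)
print(values, total)
```

Keeping the partial values in a list mirrors the "build a block with them" step; the total is just their sum.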

Friday 27th April, 2018

DideC: 14:27I guess you already solved it, but a first step is:

14:27parse data [(out: copy []) some [copy percentuale some cs (append out load percentuale) copy op ["+" | "-"] (append out load op) | none skip] to end (probe out)]
