
Extended split command

Having been forced to use Perl instead of Tcl, I noticed that Perl's split command is much more powerful than Tcl's: it can split a string on an arbitrary regexp rather than on particular characters and, optionally, put the separators into the resulting list as separate elements.
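To make the limitation concrete, here is a small sketch of what the built-in split does when given a multi-character separator. The sample string is made up for illustration.

```tcl
# Built-in split treats its second argument as a set of single-character
# separators, not as a pattern: "," and " " each split on their own.
set fields [split "a, b,  c" ", "]
puts $fields            ;# prints: a {} b {} {} c
puts [llength $fields]  ;# prints: 6
```

Every run of separators produces empty elements, which is exactly what splitting on a regexp such as `,\s*` would avoid.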

Fortunately, the regexp command in Tcl is powerful enough to implement a Perl-like split in just a few lines of Tcl code.


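proc xsplit [list str [list regexp "\[\t \r\n\]+"]] {
    set list {}
    # Consume the string one separator at a time.
    while {[regexp -indices -- $regexp $str match submatch]} {
        # Text before the separator becomes the next element.
        lappend list [string range $str 0 [expr {[lindex $match 0] - 1}]]
        # If a parenthesized group matched, keep its text as an element too.
        if {[lindex $submatch 0] >= 0} {
            lappend list [string range $str [lindex $submatch 0] \
                    [lindex $submatch 1]]
        }
        # Continue with the remainder after the separator.
        set str [string range $str [expr {[lindex $match 1] + 1}] end]
    }
    # Whatever is left is the last element.
    lappend list $str
    return $list
}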

This command behaves much like Tcl's built-in split, but it takes a regexp as its second argument, which defaults to any amount of whitespace. If the regexp contains a parenthesized group, the text that group matches is inserted into the resulting list between the split items as a separate element (only the first group is used).
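A short usage sketch of the three cases described above. The proc is repeated only so that the snippet can be pasted into tclsh on its own; the sample strings are made up.

```tcl
# Same xsplit as in the listing, repeated so this snippet is self-contained.
proc xsplit [list str [list regexp "\[\t \r\n\]+"]] {
    set list {}
    while {[regexp -indices -- $regexp $str match submatch]} {
        lappend list [string range $str 0 [expr {[lindex $match 0] - 1}]]
        if {[lindex $submatch 0] >= 0} {
            lappend list [string range $str [lindex $submatch 0] \
                    [lindex $submatch 1]]
        }
        set str [string range $str [expr {[lindex $match 1] + 1}] end]
    }
    lappend list $str
    return $list
}

# Default separator: any run of whitespace.
puts [xsplit "one  two\tthree"]        ;# prints: one two three

# Explicit regexp without parentheses: separators are dropped.
puts [xsplit "a1b22c" {[0-9]+}]        ;# prints: a b c

# Regexp with parentheses: separators are kept as separate elements.
puts [xsplit "a1b22c" {([0-9]+)}]      ;# prints: a 1 b 22 c
```

One caveat worth keeping in mind: a regexp that can match the empty string (for example `x*`) never consumes any input, so the loop would not terminate. The separator pattern should always match at least one character.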