UKC

VB.net help please

This topic has been archived, and won't accept reply postings.
 Climber_Bill 28 Mar 2017
Hi,

This is a piece of work my son is struggling with and I thought I would ask the Oracle that is UKC for help.

How would the following string:
"This is a sentence , don 't you know !" be converted to:
"This is a sentence, don't you know!".

In other words, removing only the whitespace that comes before punctuation.

I tried a regex but that got rid of all the whitespace, which is wrong.

Thanks for any help.

TJB
 john arran 28 Mar 2017
In reply to Climber_Bill:

I don't know if there's any fancy new way to do this - quite likely there is nowadays - but the old way would be to parse the string, copying into a new string each character that isn't 'whitespace followed by anything other than a letter or a digit' and ignoring the others.
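
A minimal sketch of that character-by-character copy, written in Python rather than VB.NET purely for illustration. The function name is hypothetical, and the rule is taken literally from the description above: a whitespace character is skipped when the character after it is neither a letter nor a digit.

```python
def strip_space_before_punctuation(text: str) -> str:
    kept = []
    for i, ch in enumerate(text):
        # Look one character ahead; use "" when we are at the end of the text.
        nxt = text[i + 1] if i + 1 < len(text) else ""
        # Skip whitespace whose next character is neither a letter nor a digit.
        if ch.isspace() and nxt and not nxt.isalnum():
            continue
        kept.append(ch)
    return "".join(kept)


print(strip_space_before_punctuation("This is a sentence , don 't you know !"))
```

Because the rule is "anything other than a letter or a digit", this would also remove a space in front of an opening quote or bracket, so a stricter list of punctuation marks may be safer in practice.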
 MeMeMe 28 Mar 2017
In reply to Climber_Bill:

I don't know VB.net but you have to work out what you are trying to do in terms of logic before you write some code.

E.g. I need to remove whitespace that has a letter before it and a punctuation mark after it.

Then write the code that will implement your solution (You can solve it with the correct regex).
 Si_G 28 Mar 2017
In reply to Climber_Bill:

If he's been set it as homework, you really need to know the point of the lesson - there are different approaches (as others mention).
 MeMeMe 28 Mar 2017
In reply to Climber_Bill:

Actually scratch that, removing whitespace before any punctuation (your solution) should work fine.
What's your regex? I've no idea of the VB.net regex format, but you need to match one or more whitespace characters followed by one punctuation character and replace that with the same thing minus the whitespace (you'll need to match the punctuation as some kind of group so you can put it back in the converted text, if you know what I mean).
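
The group idea can be sketched as follows, in Python rather than VB.NET for illustration. The names `PUNCTUATION` and `tidy` are hypothetical, and the list of punctuation marks is an assumption. Python writes the backreference as `\1`; in .NET replacement strings the equivalent is `$1`.

```python
import re

# Assumed set of punctuation marks; extend it as needed.
PUNCTUATION = r"[.:;,?!']"


def tidy(text: str) -> str:
    # \s+ matches the run of whitespace; the parentheses capture the
    # punctuation mark so that \1 can put it back in the result.
    return re.sub(r"\s+(" + PUNCTUATION + ")", r"\1", text)


print(tidy("This is a sentence , don 't you know !"))
```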
 johnmctighe 28 Mar 2017
In reply to Climber_Bill:

You can use the Split function to tokenise your string. Split by whitespace into an array and then rebuild the string word by word with a For Each loop. If your string is very long and performance is important, use the StringBuilder class instead.
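
A rough sketch of the split-and-rebuild approach, in Python for illustration (the names are hypothetical and the punctuation list is an assumption). The list that is joined once at the end plays the role a StringBuilder would play in .NET.

```python
# Assumed set of punctuation marks.
PUNCTUATION = ".:;,?!'"


def rebuild_from_tokens(text: str) -> str:
    parts = []
    for token in text.split():
        # A token that starts with punctuation is glued onto the previous one;
        # every other token (after the first) gets a single space before it.
        if parts and token[0] not in PUNCTUATION:
            parts.append(" ")
        parts.append(token)
    return "".join(parts)


print(rebuild_from_tokens("This is a sentence , don 't you know !"))
```

One side effect to be aware of: splitting on whitespace also collapses runs of spaces and drops leading and trailing whitespace, which may or may not be wanted.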
 elsewhere 28 Mar 2017
In reply to Climber_Bill:
(W+)[.:;,?!]

One or more whitespace characters preceding punctuation.

UKC won't let me put the required backslash in front of the W
Post edited at 10:20
 Mike-W-99 28 Mar 2017
In reply to Climber_Bill:

A regex will work. Look up backreferences on how to make sure the punctuation remains, it's quite clever how it works.
OP Climber_Bill 28 Mar 2017
In reply to elsewhere:

Thanks everyone for the replies, much appreciated.

originalSentence = Regex.Replace(originalSentence, "(\W+)[.:;,?!]", " ")

Returns:

This is a sentence don 't you know

Still not quite right.
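
To see why this attempt loses the punctuation, here is the same call sketched in Python, assuming the pattern had the backslash (`\W`) that the forum strips. The punctuation mark is part of the match but is not captured or put back, so it is replaced by the space along with the whitespace. The apostrophe is also missing from the character class, which is why "don 't" is left untouched.

```python
import re

sentence = "This is a sentence , don 't you know !"

# The whole match (whitespace AND punctuation) is replaced by a single space,
# so the punctuation mark is thrown away along with the whitespace.
first_attempt = re.sub(r"(\W+)[.:;,?!]", " ", sentence)

# repr() makes the doubled and trailing spaces visible.
print(repr(first_attempt))
```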
 Si_G 28 Mar 2017
In reply to Climber_Bill:

You need to remove the whitespace which directly prefixes punctuation only.
 wintertree 28 Mar 2017
In reply to Climber_Bill:

I'd use a simple for loop iterating over the text, copying to a new string as I went, with a variable as a flag to track whether the last character was a letter, space or punctuation character, and use this to decide whether the next one should be copied. Basically a trivial finite state machine doing the copy.

If VB.Net treats strings as high-level immutable objects this would have awful performance, as it thrashes object allocation and re-copies the growing string... I have no idea what VB.Net does, mind you. Just a blind guess.
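
One possible shape for that state machine, sketched in Python for illustration (names and punctuation list are assumptions). Instead of a single flag it holds back any whitespace it has seen until it knows what comes next, and it collects output in a list that is joined once at the end, which avoids re-copying a growing string.

```python
# Assumed set of punctuation marks.
PUNCTUATION = ".:;,?!'"


def copy_with_state(text: str) -> str:
    out = []       # output buffer, joined once at the end
    pending = []   # whitespace seen but not yet committed to the output
    for ch in text:
        if ch.isspace():
            pending.append(ch)
            continue
        if ch not in PUNCTUATION:
            out.extend(pending)   # whitespace before a normal character is kept
        pending.clear()           # whitespace before punctuation is dropped
        out.append(ch)
    out.extend(pending)           # keep any trailing whitespace
    return "".join(out)


print(copy_with_state("This is a sentence , don 't you know !"))
```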
 elsewhere 28 Mar 2017
In reply to Climber_Bill:
An untried suggestion

Put the round brackets around the square brackets.
Add apostrophe and any other characters classed as punctuation.
Put $1 inside double quotes of third argument.

https://msdn.microsoft.com/en-us/library/xwewhkd1(v=vs.110).aspx
Post edited at 11:02
 Swig 28 Mar 2017
In reply to Climber_Bill:

Regex is horrible even when it works. Worse than VB.NET.

Lots of possibilities though...

Tokenising using Split is awkward because you actually want the tokens to remain.

How about repeatedly attempting an operation like
s = s.Replace(" ,", ",")
until the string stops getting shorter?
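
A small sketch of that loop, in Python for illustration, extended to a list of punctuation marks (the list and the function name are assumptions). It only deals with plain space characters, not tabs or other whitespace.

```python
# Assumed set of punctuation marks.
PUNCTUATION = ".:;,?!'"


def replace_until_stable(text: str) -> str:
    while True:
        shorter = text
        for mark in PUNCTUATION:
            # Remove one space in front of every occurrence of this mark.
            shorter = shorter.replace(" " + mark, mark)
        if len(shorter) == len(text):   # nothing was removed on this pass
            return shorter
        text = shorter


print(replace_until_stable("This is a sentence , don 't you know !"))
```

The repetition matters when there are several spaces in a row before a mark, since each pass only removes one of them.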

Strategies where you have a discovery pass to identify changes followed by a single pass to remove might be a goer. Discovery pass might involve working through the string from back to front as it is spaces before punctuation that you want to nail.
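
A rough sketch of the two-pass idea, in Python for illustration (names and punctuation list are assumptions): walk the text from back to front to mark the whitespace that leads into punctuation, then copy everything that was not marked in a single forward pass.

```python
# Assumed set of punctuation marks.
PUNCTUATION = ".:;,?!'"


def two_pass_strip(text: str) -> str:
    # Discovery pass, back to front: mark whitespace that leads into punctuation.
    doomed = set()
    before_punctuation = False
    for i in range(len(text) - 1, -1, -1):
        ch = text[i]
        if ch in PUNCTUATION:
            before_punctuation = True
        elif ch.isspace() and before_punctuation:
            doomed.add(i)
        else:
            before_punctuation = False
    # Removal pass: copy every character whose index was not marked.
    return "".join(ch for i, ch in enumerate(text) if i not in doomed)


print(two_pass_strip("This is a sentence , don 't you know !"))
```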
 Swig 28 Mar 2017
In reply to wintertree:

> If VB.Net treats strings as high-level immutable objects this would have awful performance, as it thrashes object allocation and re-copies the growing string... I have no idea what VB.Net does, mind you. Just a blind guess.

Yep, as you guess. StringBuilder helps with building strings, as someone mentioned above.
OP Climber_Bill 28 Mar 2017
In reply to elsewhere:

That worked.

Changed the regex to:

originalSentence = Regex.Replace(originalSentence, "\W+([.:;,?!'])", "$1")
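
One caveat worth checking, sketched in Python for illustration and assuming the pattern uses `\W` (with the backslash the forum strips): `\W` matches any non-word character, not just whitespace, so it can swallow punctuation that sits in front of another punctuation mark. Using `\s` restricts the match to whitespace. The function names and the second test sentence are hypothetical.

```python
import re


def with_non_word(text: str) -> str:
    # \W matches any non-word character, so it can also swallow punctuation.
    return re.sub(r"\W+([.:;,?!'])", r"\1", text)


def with_whitespace(text: str) -> str:
    # \s matches whitespace only, so existing punctuation is left alone.
    return re.sub(r"\s+([.:;,?!'])", r"\1", text)


sample = "Wait ... really ?!"
print(with_non_word(sample))    # the ellipsis and the "?" are lost
print(with_whitespace(sample))  # only the spaces before punctuation go
```

Both versions give the same result on the sentence from the original question; they only differ when punctuation marks appear next to each other.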

Thanks again to everyone, much appreciated.
OP Climber_Bill 28 Mar 2017
In reply to Climber_Bill:

My son also passes on his thanks.

This was part of a much larger piece of GCSE coursework that he has done largely on his own, but was struggling with this one component.

He has been worrying about it for weeks and my constant advice to just go climbing and not worry did not go down too well, from the looks I received. Today's youth eh!

Regards,

TJB.
