Group: microsoft.public.word.vba.general
From: Jay Freedman
Date: Friday, February 15, 2008 8:18 PM
Subject: Re: Search all the characters then get range for some of them

Ah, the description of the problem is now significantly different from the
original post. And I have a strong feeling that we're still not looking at the
"real" description.

What are you trying to accomplish, stated in plain English without reference to
Word and its features? Are you trying to determine (a) how one document was
edited to make another document, (b) whether one document might have been
plagiarized from another document, (c) what similarities exist in two possibly
unrelated documents, or something else?

How will you know what phrases or paragraphs to compare? I'm sure you don't want
to go word by word (and even less do you want to go letter by letter); you'd get
tons of hits on "the", "and" and similar common words.

The question of how you store words internally in the macro is completely
irrelevant until you know what you want to do with them.
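
Purely as an illustration, and assuming the answer turns out to be that you
really do need every word together with its position, a minimal sketch could
walk the Words collection and record each word's Start and End. The macro name
and the array names are made up for the example, and it has not been tried
against your documents:

```vba
Sub ListWordPositions()
    ' Hypothetical sketch: collect each word's text and its character
    ' positions so that a range can be rebuilt and grayed out later.
    Dim oWord As Range
    Dim Texts() As String
    Dim Positions() As Long
    Dim idx As Long
    Dim n As Long

    n = ActiveDocument.Words.Count
    ReDim Texts(1 To n)
    ReDim Positions(1 To n, 1 To 2)

    idx = 0
    For Each oWord In ActiveDocument.Words
        idx = idx + 1
        Texts(idx) = Trim$(oWord.Text)     ' strips the trailing space(s)
        Positions(idx, 1) = oWord.Start    ' offset of the first character
        Positions(idx, 2) = oWord.End      ' offset just past the last one
    Next oWord

    ' Example use: gray out the first word from its stored positions.
    ActiveDocument.Range(Positions(1, 1), Positions(1, 2)).Font.Color = _
        wdColorGray35
End Sub
```

Be aware that the Words collection treats punctuation and paragraph marks as
separate "words" and that each item includes its trailing spaces. Looping over
it can also be slow in a long document, which is one more reason to pin down
the real goal first.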

On Fri, 15 Feb 2008 13:53:01 -0800, leverw wrote:

>OK, Jay and Doug, I misunderstood your solutions. BUT, our application (a
>document comparison between Office documents) requires we know all the words
>in a document, so we can compare them against another document. We also need
>to know the positions of each word in a document so we can gray it out later
>if there is a matching phrase/paragraph in another document.
>
>So to re-phrase my question better: how do I load all the words/characters
>and their positions from a document, so later I can gray them out?
>
>Again, millions of thanks in advance.
>
>
>"Jay Freedman" wrote:
>
>> What Doug was suggesting was _not_ loading the range of the document into an
>> array, but loading an array with the list of words that you need to search
>> for. The "array" can be just a single Variant type variable. There are
>> several ways to load it. For a small amount of data you could use the Split
>> function to make an array from a string:
>>
>> Dim WordsToFind As Variant
>> WordsToFind = Split("one,two,three,four", ",")
>>
>> Once you have this array, you can rewrite the macro like this:
>>
>> Sub demo2()
>>     Dim WordsToFind As Variant
>>     Dim idx As Long
>>     Dim oRg As Range
>>
>>     WordsToFind = Split("one,two,three,four", ",")
>>
>>     For idx = 0 To UBound(WordsToFind) ' <===
>>         Set oRg = ActiveDocument.Range
>>         With oRg.Find
>>             .ClearFormatting
>>             .Replacement.ClearFormatting
>>             .Text = WordsToFind(idx) ' <===
>>             .Replacement.Text = "^&" ' same as found
>>             .Replacement.Font.Color = wdColorGray35
>>
>>             .Forward = True
>>             .Wrap = wdFindContinue
>>             .Format = True
>>             .MatchCase = False
>>             .MatchWildcards = False
>>
>>             .Execute Replace:=wdReplaceAll
>>         End With
>>     Next idx ' <===
>> End Sub
>>
>> For the > 1000 words that you mentioned, I assume that you have a list
>> somewhere, maybe in a Word document or a text file. If you explain where the
>> list is and what separates the words from each other, we can suggest a good
>> way of getting the list into the array.
>>
>> --
>> Regards,
>> Jay Freedman
>> Microsoft Word MVP FAQ: http://word.mvps.org
>> Email cannot be acknowledged; please post all follow-ups to the newsgroup so
>> all may benefit.
>>
>> leverw wrote:
>> > I have only been doing this VSTO programming for the last 4 weeks.
>> > So how do I "... load the words into an array" quickly as you
>> > describe? This is exactly what I need to do. But when I call
>> > Document.Range, it gives me all the word ranges in the document
>> > range, but I have to go through each Word.Range call to get the
>> > actual word, which is not efficient to me, right?
>> >
>> > Many thanks in advance again!
>> >
>> > "Doug Robbins - Word MVP" wrote:
>> >
>> >> It's a simple matter to load the words into an array and then use
>> >> code similar to Jay's to iterate through the array, and process each
>> >> word in turn.
>> >>
>> >> You probably would not have time to get a cup of coffee while it was
>> >> doing it.
>> >>
>> >> --
>> >> Hope this helps.
>> >>
>> >> Please reply to the newsgroup unless you wish to avail yourself of my
>> >> services on a paid consulting basis.
>> >>
>> >> Doug Robbins - Word MVP
>> >>
>> >> "leverw" wrote in message
>> >> news:8373C72E-2042-4897-8B61-0CAACC9543F7@microsoft.com...
>> >>> Thanks for your answer. But in our case, we have many words (>> 1000)
>> >>> that we need to search and replace (actually just gray out the text).
>> >>> Doing it one at a time is not very scalable, right? I thought I could
>> >>> get the entire text and look for them myself. If some are positioned
>> >>> consecutively, I can gray out several words at a time.
>> >>>
>> >>> Thanks again.
>> >>>

--
Regards,
Jay Freedman
Microsoft Word MVP FAQ: http://word.mvps.org
Email cannot be acknowledged; please post all follow-ups to the newsgroup so all may benefit.