Another single letter: deprecate and depreciate

16 Thursday Apr 2015

deprecate and depreciate have frequently confused me. I have a fairly good idea of the meaning of each, but often am not sure which one I really want to use. One inhibiting factor is my financial background, where depreciate has such a specific meaning that I sometimes neglect its broader meanings and the possibility of wider use. So let’s look at these:

deprecate, a verb with definitions from three different sources:

1) express disapproval of

2) another term for DEPRECIATE (sense 2: disparage or belittle (something))

to express severe disapproval of another’s action

1) to express earnest disapproval of

2) to urge reasons against; protest against (a scheme, purpose, etc.)

3) to depreciate; belittle

4) {archaic} to pray for deliverance from

Now right away we have the issue that my first source effectively says that deprecate is the same as sense 2 of depreciate, whose definitions we’ll show below:

1) diminish in value over a period of time

<special usage> reduce the recorded value in a company’s books of (an asset) each year over a predetermined period

2) disparage or belittle (something)

1) to reduce the purchasing value of (money)

2) to lessen the value or price of

3) to claim depreciation on (a property) for tax purposes

4) to represent as of little value or merit; belittle

The first pair of definitions is from the same source as the first definition of deprecate, so its sense 2) is what the second definition of deprecate refers to.

So these sources are saying depreciate really is the same as deprecate but with the added meanings of the financial sort. I wonder how this happened – a word gains new meanings but also gains a letter.

A usage note in dictionary.reference.com provides the following:

An early and still the most current sense of deprecate is “to express disapproval of.” In a sense development still occasionally criticized by a few, deprecate has come to be synonymous with the similar but etymologically unrelated word depreciate in the sense

That’s interesting but not quite enough. What more can I find?

For depreciate two sources list these etymologies:

Late Middle English: from late Latin depreciat- ‘lowered in price, undervalued’, from the verb depreciare, from Latin de- ‘down’ + pretium ‘price’.

and

1640-50; < Late Latin dēpretiātus undervalued (past participle of dēpretiāre, in Medieval Latin spelling dēpreciāre), equivalent to Latin dē- de- + preti (um) price + -ātus -ate

IOW, these both cover the financial sense of the term, not the disparage or express disapproval sense, so let’s see about deprecate, from the same two sources:

Early 17th century (in the sense ‘pray against’): from Latin deprecat- ‘prayed against (as being evil)’, from the verb deprecari, from de- (expressing reversal) + precari ‘pray’.

and

1615-25; < Latin dēprecātus prayed against, warded off (past participle of dēprecārī), equivalent to dē- de- + prec (ārī) to pray + -ātus -ate

So it appears each word started with just one sense and these are quite different, but over time (unclear when/how) deprecate has subsumed some of the meaning of depreciate. Sounds to me like this is really the same confusion I’ve always had.

So, both through logic and from the usage note (a sense development still occasionally criticized by a few), I’d say the few are right: keep them separate. When you’re talking about loss of value of something, use depreciate, and when you mean disparaging or disapproving, use deprecate. That’s actually what I thought (sorta) but now I’ve convinced myself of this via a little research.

Also, as a work-still-in-progress, I’m trying to use my synonym datafile and code to see what degree of similarity they would indicate, but that will be another post.

anaphora antistrophe aposiopesis ellipsis oxymoron tropology: really the same?

09 Thursday Apr 2015

Posted by dmill96 in comment, wod

Tags

meanings, WordOfTheDay

I’ve been accumulating synonyms, from a particular source, for the words in my vocabulary to see if I can do various processing on the synonym lists to determine “similarity” between words, a metric I’ll then use in various ways in composing quizzes. One benefit of doing the tedious process of data entry myself is I begin to notice patterns, especially characteristics of this particular source of synonyms. Do these patterns then suggest some flaws in that source?

According to my synonym source these words have exactly the same synonyms which would imply they are very similar, if not identical – but is this true? And these words: metonymy, onomatopoeia, malapropism, alliteration, adumbration have almost exactly the same synonyms as the first six. Eleven almost identical words – seems crazy, so let’s see how close their definitions are:

anaphora

n.
1. [grammar] the use of a word referring to or replacing a word used earlier in a sentence, to avoid repetition
2. [rhetoric] the repetition of a word or phrase at the beginning of successive clauses

antistrophe

n.
the second section of an ancient Greek choral ode or of one division of it

aposiopesis

n.
[rhetoric] the device of suddenly breaking off in speech

ellipsis

n.
the omission from speech or writing of a word or words that are superfluous or able to be understood from contextual clues

oxymoron

n.
a figure of speech in which apparently contradictory terms appear in conjunction

a phrase made by combining two words that are contradictory or incongruous

a figure of speech by which a locution produces an incongruous seemingly self-contradictory effect

tropology

n.
the figurative use of language
<special usage> [Christian theology] the figurative interpretation of the scriptures as a source of moral guidance

So of these, ellipsis and anaphora do seem quite close, but the rest are not that connected. aposiopesis has some connection with ellipsis and anaphora, but antistrophe and tropology seem quite unrelated and also have very specialized meanings. And of course oxymoron, a widely used word, seems to have essentially no connection (other than being descriptive of speech patterns).

Let’s look at the other five:

metonymy

n.
the substitution of the name of an attribute or adjunct for that of the thing meant, for example suit for business executive, or the track for horse racing
[rhetoric] a figure of speech that consists of the use of the name of one object or concept for that of another to which it is related, or of which it is a part, as “scepter” for “sovereignty” or “the bottle” for “strong drink” or “count heads (or noses)” for “count people”

onomatopoeia

n.
the formation of a word from a sound associated with what is named
<special usage> the use of such words for rhetorical effect

words that sound like, or suggest, their meaning

1. the formation of a word, as cuckoo, meow, honk, or boom, by imitation of a sound made by or associated with its referent
2. a word so formed
3. the use of imitative and naturally suggestive words for rhetorical, dramatic, or poetic effect

malapropism

n.
the mistaken use of a word in place of a similar-sounding one, often with unintentionally amusing effect

deliberate misuse of a word or mangling of the English language, often done for comic effect

1. the unintentional misuse of a word by confusion with one of similar sound, esp. when creating a ridiculous effect
2. the habit of misusing words in this manner

alliteration

n.
the occurrence of the same letter or sound at the beginning of adjacent or closely connected words

1. the commencement of two or more stressed syllables of a word group either with the same consonant sound or sound group, or with a vowel sound that may differ from syllable to syllable
2. the commencement of two or more words of a word group with the same letter

the repetition of similar sounds, especially at the beginnings of words, in written speech or the spoken word

adumbration

v.
report or represent in outline
<special usage> indicate faintly; foreshadow or symbolize; overshadow

1. to produce a faint image or resemblance of; to outline or sketch
2. to foreshadow; prefigure
3. to darken or conceal partially; overshadow

So adumbration really has almost no connection at all with any of the other ten, so synonyms as a measure of similarity really fail there. metonymy seems to have some connection, but onomatopoeia and malapropism (which are vaguely related to each other) and alliteration have very little connection to any of the first six.

Now at least ten of the words have something to do with patterns of speech (but so do lots of other words), so I think this synonym source is doing a disservice by treating these as being as similar as they do. Doing a little reverse engineering, I think this apparent similarity is an artifact of their data structures and their method of retrieving “synonyms”: they actually grab all the synonyms of a particular sense/meaning of the word. In terms of the experiment I’m doing with synonym trees, the source is doing something similar, i.e. including everything at the next degree of separation.

Now while this pattern is particularly strong for these eleven words, they also use this device for other words, even though the shared entries might be only a fraction of the total list of synonyms. In short, they’re generating too much overlap.

Now in terms of my practical requirements detecting this similarity would be a good reason to include some of these terms in a quiz for one of the others, not to exclude them because they might have the same definitions. So I think I can use synonyms (possibly both list vector comparison and trees) as measures of similarity, but then I’m actually back to the original possible issue and that is comparing definitions themselves for similarity (not literally, but approximately, as with search). At least the synonyms might provide a smaller set of definitions to have to examine and that’s good as the comparison is slow, per definition, and the comparison for the entire vocabulary is N^2 so I can’t just do brute force, especially during real-time generation of the quiz.
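To illustrate the idea of using synonyms to shrink the set of definitions that need the slow comparison, here is a minimal sketch in Python (purely illustrative; it is not the application's code). It builds an inverted index from each synonym to the headwords that list it, so only pairs sharing at least `min_shared` synonyms become candidates. The function name, the threshold, and the sample data are all made up for this example.

```python
from collections import defaultdict
from itertools import combinations


def candidate_pairs(synonyms, min_shared=2):
    """Return {(word_a, word_b): shared_count} for headwords sharing synonyms.

    `synonyms` maps each headword to its list of synonyms (hypothetical layout).
    """
    # Inverted index: synonym -> set of headwords that list it.
    index = defaultdict(set)
    for word, syns in synonyms.items():
        for syn in set(syns):
            index[syn].add(word)

    # Count, for every pair of headwords, how many synonyms they share.
    shared = defaultdict(int)
    for words in index.values():
        for a, b in combinations(sorted(words), 2):
            shared[(a, b)] += 1

    return {pair: n for pair, n in shared.items() if n >= min_shared}


# Made-up synonym lists, only to exercise the function.
synonyms = {
    "anaphora": ["figure of speech", "trope", "repetition"],
    "ellipsis": ["figure of speech", "trope", "omission"],
    "adumbration": ["sketch", "outline"],
}
pairs = candidate_pairs(synonyms)
print(pairs)
```

Only the pairs that come out of this step would then go on to the expensive definition comparison. If one synonym is shared by nearly every headword the pair counting is still quadratic for that synonym, so very common synonyms might need to be skipped.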

As to the value of any of these words as Word Of The Day, most of them seem sufficiently obscure to be something any of us would rarely use, although obviously alliteration, malapropism and oxymoron seem more useful. adumbration, onomatopoeia and ellipsis seem like the kind of words you might encounter in some advanced test, so these six are probably worth learning.

There is too much of this today: perfidious

05 Sunday Apr 2015

Posted by dmill96 in wod

Tags

WordOfTheDay

I haven’t done a WordOfTheDay for a while so I picked this one. These posts seem to attract little attention (after all, a search for a word gets many more hits from the vast number of dictionaries on the Web), but it’s useful for me to attempt to use these words, as another reminder to inculcate the word. Already today I missed invious (remember that one?) which was in a previous post, but is a word I’ve had no chance to use in any discourse (at least until I do my annual camping trip to wilderness).

So here’s the definition of perfidious:

adj.
{poetic/literary} deceitful and untrustworthy

adj.
deliberately faithless; treacherous; deceitful

While the political knuckle-draggers would undoubtedly apply this to our President or even all progressives (and certainly climate scientists) I think it more apposite when applied to either opportunist Ted Cruz or even more so BibiNet who thinks he can manipulate U.S. politics just as he deceived his own population with the “Arabs are coming” fear-mongering (his later apology is crocodile tears, totally fake, he did what he had to do to arouse prejudice just to win an election).

Now somehow perfidious sounds a bit like what it means (just as execrable does) so it should be easy to remember. And in any comment on current affairs, esp. in this political season, there will be plenty of opportunity to use this word, so give it a try, no matter who you apply it to.

And here’s all the synonyms: betraying, deceitful, deceptive, faithless, false, insidious, misleading, recreant, shifty, slick, traitorous, undependable, unfaithful, unloyal, unreliable, untrustworthy; most of which seem to fit the definition, but a few don’t which is the problem of using synonyms to provide “meaning” to words.

btw: Don’t confuse perfidious with perfervid (which I just did) as it means:

very fervent; extremely ardent; impassioned

Not much drilling lately

05 Sunday Apr 2015

Posted by dmill96 in statusreport

Tags

progress

The combination of my major restructuring of my applications, my experiments with synonyms (still incomplete), and lately the pressing deadline of getting taxes done (three sets, not just mine) has severely eaten into my drills (plus a bit of boredom as well). That provides, indirectly, a bit of experiment by undermining the short-term memory I accumulate with high rates of drills so this pause is a chance for my long-term stats (accuracy per word on drills) to get closer to longer-term knowledge. The history, which is impossible to simulate (or even really test the code) is essential to this application so my lack of activity is actually helpful. So here’s the story, graphically:

The blue markers and line are my actual daily drills (on the full vocabulary, now with 13,273 drills, whew, a lot: about 33 hours spent on that over 134 days, or averaging about 15 minutes/day). All those zeros starting around day 115 have dragged down my 7-day moving average (the gold markers and line) to the lowest since I started all these drills. Now I have done a few others (about 2000) which I’ll eventually merge into this set but the other drills are on different word sets and will distort things a bit.

Looking at the data a different way:

The flat part of the curve is the recent “pause” with only tiny growth compared to the previous high rate of drills (in essence, this is the integral of the previous graph). So I’ve fallen below my long-term trendline (the dotted regression line) and need to catch up a bit.
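The two views described above (a 7-day moving average of daily drills, and the running total that acts as its "integral") can be sketched in a few lines of Python. This is only an illustration with made-up daily counts, not the real drill history, and the trailing-window choice is an assumption.

```python
from itertools import accumulate


def moving_average(daily, window=7):
    """Trailing moving average; early days use however many days exist so far."""
    averages = []
    for i in range(len(daily)):
        chunk = daily[max(0, i - window + 1): i + 1]
        averages.append(sum(chunk) / len(chunk))
    return averages


# Made-up daily drill counts: a busy stretch followed by a pause.
daily = [100, 120, 80, 0, 0, 0, 0, 0]

avg7 = moving_average(daily)          # what the gold line would show
cumulative = list(accumulate(daily))  # running total, flat during the pause

print(avg7)
print(cumulative)
```

A run of zero days pulls the moving average down while the cumulative curve simply goes flat, which is the pattern the two graphs show.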

And looking at this a different way:

This is the distribution of #drills (horizontal axis) and counts (vertical axis). Since my biased picker should tend to drill more on the less drilled words the gold line (today, after 150 drills) should be lower than the blue line (before today’s drills) on the left side and higher on the right side. But 150 drills is just a bit over 1% of the total number of drills so one day makes little difference. BTW: the bulge in the lower #drills was my addition to this drill file of a few hundred newly setup words just a while back (about a month) so these recent additions still haven’t blended into the long-term pattern. I also found an anomaly in the code (the biased picker) exactly at 10 previous picks and so I’ve fixed that and the large peak at 10 is slowly coming down. Which, btw, is the point of some of these graphs: they’re the only way to look at this huge amount of data and determine the effects of code changes and/or vocabulary content changes, so again a way I can validate what my application is doing (it’s hard to see patterns just from doing drills when, with now nearly 2000 words, I’m only doing about 1/10th of them on any given day).
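A biased picker of the kind described here could look roughly like the sketch below. The post does not give the actual weighting, so the `1 / (1 + drills)` weight is purely an assumption chosen to favor less-drilled words; the word names and counts are invented.

```python
import random


def pick_word(drill_counts, rng=random):
    """Pick one word at random, favoring words with fewer previous drills.

    The weight 1 / (1 + drills) is a hypothetical choice, not the app's formula.
    """
    words = list(drill_counts)
    weights = [1.0 / (1 + drill_counts[w]) for w in words]
    return rng.choices(words, weights=weights, k=1)[0]


# Invented drill history: one new word and one heavily drilled word.
drill_counts = {"invious": 0, "perfidious": 9}

rng = random.Random(2015)  # seeded so the tally is repeatable
tally = {"invious": 0, "perfidious": 0}
for _ in range(10_000):
    tally[pick_word(drill_counts, rng)] += 1

print(tally)
```

With these weights the undrilled word should come up about ten times as often as the one drilled nine times, which is the kind of shift the distribution graph is meant to reveal over many days.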

So the work goes on.

p.s. While drills may be down, meanwhile I have accumulated 935 synonyms in my collection (some of this count is derivatives or inflections) so that’s a significant portion of my total vocabulary (I’m doing synonyms in a separate file format and different process, not sure, yet, how I’ll add these to my vocabulary or exactly what I’ll use them for). So work continues, just in different places. Meanwhile I’ve been very tempted to attempt to recover (what I can) of my Spanish food terms (some of the data I got was clean) but that’s yet another distraction from all sorts of other unfinished work.

Making progress of vocabulary app rewrite and architecture

29 Sunday Mar 2015

Posted by dmill96 in programming

Tags

application, progress

Note: I actually get to use the term ‘architecture’ since I actually know what it really means and I’m not using it in the dumbed-down sense many others do.

Amazingly I’m nearly nine months into building this vocabulary application and it has ballooned into a far larger project than I originally imagined. And recently it “broke” due to a problem I’ve now fixed (my anti-virus software was breaking builds in Visual Studio). That problem, while unfixed, forced me to (painfully) recreate the application from scratch (cutting-and-pasting old source into a new VS project), and since it was the primary motivation for my rewrite, I’m glad it happened.

At first I just started a gradual conversion, in-place, i.e. within existing code. Then I decided to flush that and create the application tabula rasa, building the basics of the architecture pattern first, then porting the existing code into that framework (a lot I don’t want to rewrite, but will have to restructure).

Now my last career job was a software architect. I know how to do these things “right”. But architecture usually applies to long-lived products created by teams of hundreds using multiple servers. That’s not exactly what my recreational project is. BUT, some of those ideas can apply, even if a bit tedious.

So I’m using the common and conventional MVC pattern. I’ve begun, but not completed, the abstraction of my XML into a set of classes where only the lowest base classes have any XML processing in them (easy to enforce: don’t put using System.Xml in any place other than one class.cs file). The V will be a rewrite of my somewhat messy (internally, not too bad externally) arrangement of all the .Net controls within a Windows form, but NOW it will not have any logic or knowledge of the data, instead communicating only with the Controller via messages (passed in memory, but they could go through a communication channel, like HTTP with JSON or XML). That means I’ll wall off all my UI code so it can easily be replaced (just in case Visual Studio has some other gotcha), or even that I could port all this to a server and let you guys play with it.
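The shape of that separation can be sketched briefly. The real application is C#/.NET with Windows Forms; the Python below is only an illustration, and every class name, message field, and the sample definition are invented for the example. The point it shows is that the View knows nothing about the data and talks to the Controller only through serialized messages, so the same exchange could later travel over HTTP.

```python
import json


class Model:
    """Owns the data; in the real app this is where XML handling would hide."""

    def __init__(self):
        self._definitions = {"perfidious": "deceitful and untrustworthy"}

    def lookup(self, word):
        return self._definitions.get(word)


class Controller:
    """Receives JSON messages and answers with JSON messages."""

    def __init__(self, model):
        self._model = model

    def handle(self, message):
        request = json.loads(message)
        if request.get("action") == "lookup":
            definition = self._model.lookup(request["word"])
            return json.dumps({"ok": definition is not None,
                               "definition": definition})
        return json.dumps({"ok": False, "error": "unknown action"})


class View:
    """Has no data logic; it only sends messages and renders replies."""

    def __init__(self, send):
        self._send = send  # any callable taking and returning a JSON string

    def show(self, word):
        request = json.dumps({"action": "lookup", "word": word})
        reply = json.loads(self._send(request))
        return reply["definition"] if reply["ok"] else "(not found)"


controller = Controller(Model())
view = View(controller.handle)  # in memory here; could be an HTTP call instead
print(view.show("perfidious"))
```

Because the View only holds a `send` callable, swapping the in-memory call for a network client would not touch the UI code.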

To do this I’ve had to refresh my memory on some details of architectural concepts and do several rewrites of my infrastructure classes, but I’m getting there. Another day or so and most of the infrastructure, except for a few tweaks, should be there and so then I can migrate all my code that actually does the application (the CRUD parts) into the Controller. While this will be a lot cleaner it is fairly abstract and it’s much more work to code than the spaghetti code I had before but it will pay off.

And my quiz code, even within the current structure, is a mess and needs to be rewritten, plus significantly extended in functionality. And I need to finish all the fiddling with synonyms (plus do a POC of trying to find “similarity” in the text of the definitions) so I can get all that worked in. So I can visualize months more work on this. If I had to put a price tag on all this (paying people to do all this, including me) I’d guess this application is now in the $200K cost range. Wow, maybe I do need to do additional steps of conversion to “product” (at least a website) so I can make this worth something.

But at least I get to stay active in all this, occasionally having to refresh my memory of stuff I already knew, so I retain my competency. Unlike riding a bike, which you never forget (even if you do get out of shape), software development is something you can forget, not to mention becoming seriously out of date as the world moves forward.

So I’ll keep on keepin’ on.

A different new experiment – 3

26 Thursday Mar 2015

Posted by dmill96 in comment, programming

Tags

application, meanings

Well I got a few results (not very good) of attempting to score how “similar” two words are by comparing their synonym lists and assigning a score. My first attempt revealed a clear problem in the weights I used in generating the score so I changed that and here are some results.

My list of words with synonyms goes to the most degrees of separation starting from the word ‘denounce’ so that was my test.

By synonym-list scoring, denounce has these five words as its most similar: excoriate, vituperate, censure, reprehend and reprove. These words can also be ranked by my scoring via expanding the synonym tree and so the ranking is: excoriate (20th), vituperate (18th), censure (1st), reprehend (16th) and reprove (9th).

OTOH, scoring with the synonym tree yields these as the top five matches (rank by synonym list included in () ): censure (3rd), reproach (15th), rebuke (18th), upbraid (8th), and castigate (22nd).

IOW, only one word is in common, in the top five best results, between the two algorithms so the obvious question is which top5 is best and/or what really are the best synonyms (or, actually, what I really want, “most similar”).

Also it’s interesting to consider whether my algorithm, and in fact human commonsense, would make synonyms commutative (symmetric), that is, if X is the best synonym of Y, is Y X’s best synonym? It’s hard to say what commonsense suggests, but I’d say no. So what does my synonym list scoring say? denounce’s #1 synonym is excoriate but excoriate’s #1 is castigate and denounce is its 6th best. denounce’s #2 synonym is vituperate but vituperate’s #1 is vilify and denounce is #2. denounce’s #3 synonym is censure but censure’s #1 is animadvert and denounce is its 9th ranked. denounce’s #4 synonym is reprehend but reprehend’s top is censure and denounce is its 11th. And finally denounce’s 5th best scored is reprove, but reprove’s first is upbraid and denounce is 16th. Now, based on this extremely limited data there does, however, look to be a bit of a pattern: the lower ranked synonyms of denounce also, increasingly (by their rank under denounce), have denounce even lower ranked, so there is some relationship.

If I average the ranks in the two algorithms, my top5 is censure, vilify, upbraid, reproach and reprove, so interestingly vilify, which is 6th in one list and 7th in the other, makes it into the top five due to the big discrepancies in the other words’ rankings.
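The rank averaging can be checked with the ranks quoted in this post. The sketch below (Python, for illustration only) uses just the ten words whose ranks are given; which of the two lists has vilify at 6th versus 7th is not stated, so the pair (6, 7) is an assumption that does not change its average.

```python
# (rank by synonym-list scoring, rank by synonym-tree scoring) for 'denounce',
# taken from the figures quoted in the post.
ranks = {
    "excoriate": (1, 20),
    "vituperate": (2, 18),
    "censure": (3, 1),
    "reprehend": (4, 16),
    "reprove": (5, 9),
    "reproach": (15, 2),
    "rebuke": (18, 3),
    "upbraid": (8, 4),
    "castigate": (22, 5),
    "vilify": (6, 7),  # assumed order; the average is 6.5 either way
}


def average_rank(pair):
    return sum(pair) / len(pair)


# Sort words by their average rank across the two algorithms (lower is better).
combined = sorted(ranks, key=lambda word: average_rank(ranks[word]))
top5 = combined[:5]
print(top5)
```

Sorted strictly by average, these are the same five words as the top5 above: a word that is middling in both lists beats one that is first in one list and near twentieth in the other.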

So let’s look at the definitions of all these:

denounce	publicly declare to be wrong or evil <special usage> inform against
excoriate	1. {formal} censure or criticize severely 2. [chiefly medicine] damage or remove part of the surface of (the skin)
vituperate	blame or insult (someone) in strong or violent language
censure	express severe disapproval of (someone or something), typically in a formal statement
reprehend	reprimand
reprove	reprimand or censure someone
reproach	address (someone) in such a way as to express disapproval or disappointment <special usage> (reproach someone with) accuse someone of; {archaic} censure or rebuke (an offense)
rebuke	express sharp disapproval or criticism of (someone) because of their behavior or actions
upbraid	find fault with (someone); scold
castigate	{formal} reprimand (someone) severely
vilify	speak or write about in an abusively disparaging manner

reprimand	rebuke (someone), especially officially
rebuke	express sharp disapproval or criticism of (someone) because of their behavior or actions

Boy, is this a bunch of circular definitions, no wonder it’s hard to decide which is which. reprimand is really just rebuke, and since rebuke is already ‘sharp’, castigate is severely sharp; censure is already severe, so excoriate is severely severe (does that mean I’d rather be censured than excoriated? or rebuked instead of castigated?). The key part seems to be the “public” part of this, which denounce has and so do censure and reprimand (and thus reprehend and reprove), but rebuke, upbraid and reproach are missing the public aspect. So I’d say it’s a bit of a tossup between rebuke and censure, but neither of those goes past mere “disapproval”, not as far as declaring evil. vituperate and vilify seem the most remote.

But it would be really rough (maybe some English profs or etymologists could chime in) to really decide which is best. For my purposes I can say these words are all “similar” but in terms of my programming issue (avoid a multiple choice quiz where more than one answer could be construed as “correct”) I’m not sure this helps. It actually would be a challenging quiz to have these five possible answers:

publicly declare to be wrong or evil

blame or insult (someone) in strong or violent language

express severe disapproval of (someone or something), typically in a formal statement

find fault with (someone); scold

express sharp disapproval or criticism of (someone) because of their behavior or actions

in order to sharpen one’s precise (assuming Oxford is precise) sense of meanings of denounce, vituperate, censure, upbraid and rebuke.

A different new experiment – 2

25 Wednesday Mar 2015

Posted by dmill96 in programming

Tags

application

Continuing on my series of posts (here and here) about using synonyms to measure similarity between words I’ve accumulated a bit more data and added some code and so can present a bit more on the results. To review I started with the word ‘denounce’ and grabbed all its synonyms. Then for those synonyms that are also in my vocabulary I grabbed their synonyms and so forth, for about four “degrees of separation”. To make this more interesting I used some other selections of words, from my vocabulary, that are not particularly related to the original “synonym tree” for ‘denounce’. So what does this produce?

First I modified my code so I can limit the depth of recursion (starting from a word, get its synonyms and then find their synonyms and so on to the recursion limit), using a weighted scoring system to determine the “closest” synonyms in the whole tree. I still need to tune: a) the weights I use in the algorithm, and, b) how deep to go in the tree. Looking at this second issue, here’s a graph to provide some enlightenment on how degrees of separation influence the size of the synonym tree:

Here I’ve taken the counts of how many synonyms (just those in my vocabulary) I found and sorted them and then plotted to get a type of histogram. The lowest cyan colored dots/line is each instance of #synonyms for all 690 words with only one degree of separation (i.e. the direct synonyms of the word). So for about 150 words the number is zero (there are synonyms, just all common words and so not counted). About 650 words have fewer than 100 synonyms; after that the counts rise with the highest number at about 250. The orange line is two degrees of separation, so naturally higher (but with the same number at zero synonyms). The grey line is four degrees of separation and the yellow line is six degrees of separation. But looking at the dark blue line (fifteen degrees of separation, essentially the most my data could have) you see that most words have one of two counts – zero or 630 (all words). That is to say each word is either connected to nothing (again, only synonyms in my vocabulary are counted) or to everything (even more than zero)! Only a few words have a total #synonyms in their entire tree in between (those values around 250 on horizontal axis).

What this says is that in fact most words are eventually connected to every other word (or none) if you expand the synonym tree far enough; IOW, it’s useless to do that (although my scoring will adjust for this). So it’s about the 4th degree of separation where most words get more and more connections, approximately linearly per word. Six degrees is beginning to look like 15, i.e. words have connections to most other words, so probably a synonym tree of a depth of 4 is about right. This will be critical when I start using the list of all synonyms found at any depth (and their scores which, typically, are lower, at more degrees of separation) to measure similarity. Exactly HOW I’ll measure similarity is the next part of this experiment.
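The effect of the depth limit on tree size can be illustrated with a breadth-first walk over a synonym graph. This is a minimal Python sketch, not the application's recursive code; the function name and the tiny graph are invented, and the graph is assumed to hold only vocabulary words (common words left out, as in the counts above).

```python
from collections import deque


def reachable(graph, start, max_depth):
    """Map each word reachable from `start` to its degree of separation.

    `graph` maps a word to its direct synonyms; `start` itself is excluded.
    """
    depth_of = {start: 0}
    queue = deque([start])
    while queue:
        word = queue.popleft()
        depth = depth_of[word]
        if depth == max_depth:
            continue  # do not expand past the depth limit
        for syn in graph.get(word, ()):
            if syn not in depth_of:
                depth_of[syn] = depth + 1
                queue.append(syn)
    del depth_of[start]
    return depth_of


# Invented toy graph, just to show growth with depth.
graph = {
    "denounce": ["censure", "excoriate"],
    "censure": ["rebuke", "denounce"],
    "excoriate": ["castigate"],
    "rebuke": ["reprimand"],
    "reprimand": ["upbraid"],
}

counts = [len(reachable(graph, "denounce", d)) for d in (1, 2, 3, 4)]
print(counts)
```

Running the same count for every word at several depths would give the data behind the sorted curves: with a large enough depth each word reaches either nothing or its whole connected component.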

To see the effect of ignoring common words I allowed them to be counted as well as my vocabulary words. HOWEVER, I don’t plan on all the typing to enter those common words (maybe a few for testing) as there are over 5000 unique common words used as a synonym for at least one of the 690 words where I’ve extracted and coded the synonym list per word. This is relevant too. While a common word which can’t be expanded to its synonyms is a “dead end” in the recursion it can still be counted in the scoring. Here are some examples:

abase: #found=209, Top 10: (debase 1.793 1) (censure 1.531 3) (disparage 1.422 2) (vilify 1.214 3) (demean 1.168 1) (decry 1.159 2) (malign 1.139 3) (defame 1.113 3) (reproach 1.081 4) (rebuke 1.056 4)

The top 10 highest scored synonyms for ‘abase’ are all in my vocabulary (thus scores > 1.0) whereas with:

abash: #found=20, Top 10: (disconcert 1.112 1) (rattle 1.000 1) (embarrass 1.000 1) (discountenance 0.970 2) (abash 0.894 2) (discomfit 0.782 2) (confound 0.782 2) (deprecate 0.499 3) (disparage 0.351 4) (derogate 0.351 4)

For ‘abash’ the common word ‘embarrass’ is found at the first level of separation (by definition only once; still, these are unduplicated, despite a lot of duplication in the source) and so has a score of 1.000, whereas ‘discomfit’ (in my vocabulary) is found at two degrees of separation (on more than one branch of the tree, since 0.67 would be the score, based on the weights in this run, for just a single occurrence). So is ‘embarrass’ really a better synonym than discomfit (or confound or deprecate and so forth)?

abashment: #found=28, Top 10: (discomfiture 1.224 1) (shame 1.000 1) (mortification 1.000 1) (distress 1.000 1) (discomposure 1.000 1) (confusion 1.000 1) (chagrin 1.000 1) (embarrassment 1.000 1) (disquiet 0.933 2) (vexation 0.908 2)

Here most of the closest synonyms (again based on these weights) are the common words (shame, mortification, distress, discomposure, confusion, chagrin, embarrassment) with other words from my vocabulary as better or worse.

Note: I probably need to modify code to have a third weight (maybe pair) so common words don’t score as high (for as many occurrences) as my vocabulary score, but this is yet another tuning issue before completing this project.

And here:

abatement: #found=5, Top 10: (reduction 1.000 1) (discount 1.000 1) (decrease 1.000 1) (decline 1.000 1) (lessening 1.000 1)

Only common words occur at first level of separation so the tree is completely flat and descends to no further levels (I’d have zero synonyms if I only counted words in my vocabulary). So how the common words, used as synonyms, should be handled will be a key factor in my similarity algorithm (TBD).
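One way to experiment with the handling of common words is sketched below in Python. It is not the scoring formula used for the numbers above (that formula is not spelled out here, and the quoted scores show repeat occurrences adding less than a full weight). This sketch simply adds a per-depth weight for every occurrence and scales common words by a separate factor. Only the 1.0 and 0.67 depth weights come from the post; the other weights, `COMMON_FACTOR`, and the toy graph are assumptions.

```python
from collections import defaultdict

# Depth weights: 1.0 and 0.67 are quoted in the post; 0.45 and 0.30 are made up.
DEPTH_WEIGHT = {1: 1.0, 2: 0.67, 3: 0.45, 4: 0.30}
COMMON_FACTOR = 0.5  # hypothetical down-weighting of common (non-vocabulary) words


def score_tree(graph, vocabulary, start, max_depth=4):
    """Return [(synonym, score)] sorted by descending additive score."""
    scores = defaultdict(float)

    def expand(word, depth, path):
        if depth > max_depth:
            return
        for syn in graph.get(word, ()):
            if syn in path:
                continue  # avoid walking in circles
            factor = 1.0 if syn in vocabulary else COMMON_FACTOR
            scores[syn] += DEPTH_WEIGHT[depth] * factor
            if syn in vocabulary:  # common words are dead ends
                expand(syn, depth + 1, path | {syn})

    expand(start, 1, {start})
    return sorted(scores.items(), key=lambda item: -item[1])


# Invented toy data: 'embarrass' is the only common word.
graph = {
    "abash": ["disconcert", "embarrass"],
    "disconcert": ["discomfit", "embarrass"],
    "discomfit": ["confound"],
}
vocabulary = {"abash", "disconcert", "discomfit", "confound"}

ranked = score_tree(graph, vocabulary, "abash")
print(ranked)
```

With the factor at 0.5 a common word found at the first level no longer ties the best vocabulary synonym, though it can still outrank vocabulary words found deeper in the tree; tuning that factor is exactly the open question.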

So here’s a graphical look at the effect:

Here the cyan dots/line represent the counts of synonyms from my vocabulary and the orange line also includes the common words. For the most part there is a relatively constant offset of just a few words. Note that while about 150 words have no synonyms from my vocabulary (the left part of the curves) including the common words immediately has counts > 0 across the range of all words.

Now, if two words ONLY have common words as synonyms and I only created synonym trees from my vocabulary words, how could I ever compare the two words for similarity? So the common words probably have to factor into my algorithm.

So to do similarity I’m thinking of trying several things, based on the idea of comparing distance between “vectors”. We can think of each of the 690 words as having two vectors (with only vocabulary words): a) just 0 or 1 (1 if any of the 690 words is a synonym, at whatever degree of separation) or b) the actual weights (and 0 for any missing ones). Normally vector distance is just the square root of the sum of the squares of the deltas. So that’s the obvious way to go. I can extend the vectors to include all the common words (each word’s vector is then, for this sample, about 6000 long), then use either weights or 1/0, and compute those distances as well. To see what I mean let’s look at a simplified case:

Say we have 13 “words”, A, B, …M

For word A, its synonyms are {C,F,G,K} and for word B its synonyms are {C, D, G, I, L}

So the vectors are:

A {0 0 1 0 0 1 1 0 0 0 1 0 0}

B {0 0 1 1 0 0 1 0 1 0 0 1 0}

Δ {0 0 0 1 0 1 0 0 1 0 1 1 0}
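The three vectors above can be reproduced with a few lines of Python. This is only a sketch of the idea, not code from my application: the helper names `to_vector` and `euclidean` are made up for illustration, and the “vocabulary” is just the 13 placeholder letters.

```python
import math
import string

# The 13 placeholder "words" A..M from the example above.
WORDS = list(string.ascii_uppercase[:13])


def to_vector(synonyms, words=WORDS):
    """Return a 0/1 vector: 1 where the word is a synonym, else 0."""
    return [1 if w in synonyms else 0 for w in words]


def euclidean(u, v):
    """Square root of the sum of the squares of the deltas."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


a = to_vector({"C", "F", "G", "K"})
b = to_vector({"C", "D", "G", "I", "L"})
delta = [abs(x - y) for x, y in zip(a, b)]

print(a)      # the A vector
print(b)      # the B vector
print(delta)  # the delta vector
print(euclidean(a, b))  # sqrt(5), about 2.236
```

Because `euclidean` only does arithmetic on the entries, the same function should work unchanged if the 1s are replaced by weights.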

Now we might choose to ignore positions where both vectors have 0 and only look at positions where at least one vector has a 1: 1-1 (A and B share a synonym, so no difference) and 0-1 or 1-0 (B has a synonym A doesn’t have, or A has a synonym B doesn’t have). If the synonyms were identical then the sum of the differences would be 0, and where the synonyms are completely different the sum would be #syns(A) + #syns(B).
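Counting only the 0-1 and 1-0 positions is the same as counting the synonyms that exactly one of the two words has, which Python sets do directly. Again this is a sketch with made-up function names (`synonym_difference`, `shared_synonyms`), using the A and B sets from the example:

```python
def synonym_difference(syns_a, syns_b):
    """Count synonyms that only one of the two words has (the 1-0 and 0-1 cases)."""
    return len(syns_a ^ syns_b)  # ^ is the symmetric difference of two sets


def shared_synonyms(syns_a, syns_b):
    """Count synonyms both words have (the 1-1 case)."""
    return len(syns_a & syns_b)


syns_a = {"C", "F", "G", "K"}
syns_b = {"C", "D", "G", "I", "L"}

print(shared_synonyms(syns_a, syns_b))     # 2 (C and G)
print(synonym_difference(syns_a, syns_b))  # 5 (F, K, D, I, L)

# Identical synonym sets give 0.
print(synonym_difference(syns_a, syns_a))  # 0

# Completely different sets give #syns(A) + #syns(B).
print(synonym_difference({"C", "F"}, {"D", "I", "L"}))  # 5 = 2 + 3
```

Working from the sets never touches the 0-0 positions at all, so the length of the full vector (13 here, about 6000 with the common words) doesn’t matter for this particular measure.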

So you get the idea, various different methods of calculating differences. So which one is BEST? You’ll know when I know.

[Author note: A lot of “programmers” are trained only in programming (at worst) or computer science (at best). This means many programmers are untrained in general science, math and engineering. If you’re just writing code that the architect and/or the algorithm inventor defines, no problem. But if you’re really trying to solve a problem, not just how to code it, but how to get a program to get a useful result, you need to go beyond just programming knowledge. This is one reason I’m glad my original studies were in science (materials science) and then later business (with lots of emphasis on quantitative methods for finance). Originally programming was just recreation for me, as it is now, and later my profession (where I had to fill in a few gaps, via experience, an MSCS would know from education). But I’ve always found that “pure” programmers, without a broader background, are very limited in their problem solving abilities. It’s amazing how much an idea from one area of science, engineering, statistics (or other math) or business can supplement “pure” programming skill. So, kiddies, take more courses than just the compsci stuff (and esp. more than cookbook programming as many schools do). You’ll be better equipped for more senior jobs in software development.]

My apology to Microsoft

24 Tuesday Mar 2015

Posted by dmill96 in programming

Tags

application, progress

Yesterday I described the difficulty I was having just doing a build under Visual Studio and the time I’d wasted manually recreating my project to get around this problem. And yesterday I said I thought this problem would continue to occur and so I needed to restructure my code so it’s easier and quicker to recreate a VS project.

Sorry, Microsoft, for blaming you for this, even though you are the obvious choice. Even though I’d done some searches to try to uncover fixes for this problem, I didn’t do something I learned, long ago, to try: enter the error message precisely (even including the specific xxx.exe from my project) as the search. So today, very desperate, I figured I needed some other workaround. Previous searches yielded nothing, but the literal error message triggered a possible answer.

The villain is Avast, my anti-virus software (not what I would have ever thought of as the villain). Avast isn’t really doing anything wrong; it’s simply being extremely over-protective about one program (which might be a virus) altering .exe’s (which, of course, is exactly what a build does). So Avast was either squirreling away the .exe or locking it somehow, and poor old Visual Studio attempts to do something, is denied by an error coming up from the file system, and thus can do nothing but tell me something is broken or missing.

It took a little fiddling to figure out how to get Avast to stop doing this (there is a way to exclude a folder and all its subfolders and files from virus scanning), and so now any project I build under Visual Studio’s default projects folder is protected from Avast being over-eager.

So problem solved (I hope this is all; since I had multiple symptoms there could, as sometimes happens, be more than one problem). But now I’m off on a restructuring kick, as I described yesterday, converting my quick-and-dirty personal programming approach to something closer to what I would demand, as their architect, of a professional team doing a product. When I’m done (I hate the step backwards in order to take two steps forward stuff in programming) maybe I’ll just be in better shape anyway, and so perhaps this problem and my misdiagnosis of it will be beneficial in the long run.

Comment on my programming

24 Tuesday Mar 2015

Posted by dmill96 in programming

Tags

application

I’ve kinda lost my momentum on this project, due in no small part to the fact that my application blew up, I couldn’t fix the problem (which others have reported), and I had to rebuild my application, an unpleasant and tedious process. The problem is that I use the “free” Visual Studio (Express) 2010 for my code. I’ve used Visual Studio for years professionally and am quite happy with it. In fact I’d have a very hard time giving up a really good IDE for the crappy tools (actually almost none) available for other “free” programming on Windows. But for some reason Visual Studio 2010 Express keeps making an unfixable mess of my app and I have to start over.

Now Visual Studio is a sophisticated and complex piece of software. It’s entirely reasonable that Microsoft wants to charge a lot of money for it for real developers. But they also decided to make a subset (perfectly fine for my purposes) available for free to individuals, nominally for “evaluation”. Now VS is not something you just go buy; it’s tied to the various categories of authorized developers in the MSFT developer ecosystem, again perfectly fine. But, and thank you MSFT, they did decide there are circumstances where individuals should also be able to program their own computers, not make products. BUT, I can’t help but believe the problems I’ve encountered are “deliberate”, perhaps unfixed bugs that cripple the Express version. Fine, but what a pain.

So after this has happened multiple times I now accept it as a consequence of getting something for nothing. Which means I need to be prepared to rebuild my application source, kinda from scratch, over and over. And the style of coding in my app is never very conducive to doing that, in a word, it is spaghetti code, poorly structured (although not horrible, just my laziness).

Now my working career was exactly the opposite of what I do individually. As the architect it was my job to make sure my programming team used best practices in code development. When you’re building a product that will also have a long lifetime and be supported through the usual comings-and-goings of a team over time, there are coding principles you must use.

But these are a pain, appropriate in product development, but tedious and slow in a personal project. So I cut corners, compared to what I know is the “right” way to build apps. Now actually, having been in software long enough and had many debates about methodology I’m engaged in one of the “agile” techniques (which I mostly don’t like) that is close to an endorsed approach called Extreme Programming. This was, frankly, the old-fashioned sloppy, pre-architecture, pre-design, and ad hoc era of the past, like 30 years ago. It was abandoned long ago because it produces code that is a mess, buggy, and hard to support. But it’s come back to life.

The idea, primarily endorsed in the Web world of instant changes, constant beta, never finished version, immediate release of iterations, is fairly simple. Programmers should sit with the “users” (product managers, even unsuspecting Web users) and instantly respond to requirements for change. Instead of the admittedly tedious process of “waterfall” development, where requirements and architecture and design are clarified before a line of code is written, XP says, basically, just go for it. Lots of iteration, half-baked ideas, get them in use, fix them later with refactoring or new functionality. Decades ago we called this code-and-go.

Since I am the architect, designer, product manager, user and programmer all rolled into one, naturally XP’s basic concept fits. And so with a bit of thought it’s off to the races: write some code, fiddle with it, incrementally change it and so forth. Frankly this produces a mess but it’s quick. And actually if you’re good at the process (which I flatter myself that I am) it gets a lot done, fairly quickly, with moderate effort.

UNTIL…

Something happens where the haphazard slapdash approach gets you in trouble, like having to rebuild your entire base of source code since your development tools break doing builds, for whatever reason (crippled deliberately to prevent “free” version from being used for real work or just sloppy bugs).

So it took me a full day, of trial and error, of tricks, of patching weird build errors, etc., before I finally had exactly the same code working, again, exactly the same way it worked before. I don’t want to do this again, but after three instances of having to do it I suspect I must.

Therefore I need to put my professional hat on and insist on some coding techniques, not just the tried-and-true best practices, but the additional requirement that I can recreate the code, cutting-and-pasting from backups of old source.

Now I already use classes approximately as they should be used. I don’t have long methods with many lines of complex code that I’ll never understand months after I wrote them in a fury of coding. But that’s not enough. Since VS normally is very flexible about where you put class code, i.e. which source file, since it’s all in the same namespace I don’t bother with a lot of structuring of the actual source files themselves. But, when I have to rebuild a complex (at least of personal scale) body of source code due to my tools failing, my structure makes this very hard.

And in doing this the last time I came up with a few guidelines. For instance, there is a strange connection, in the automagical way VS works, between putting source code in folders (convenient) and namespaces, one that I’d never seen before. Normally the simple folder paradigm on any modern computer makes “versions” of the same app fairly simple, but not so with VS. Actual filenames and paths matter. Classes can’t be found, even though they’re right there in the same source file that is attempting to instantiate them (this was really bizarre).

So now I have to go back and take a fairly complex application and totally restructure it (not just simple refactoring) to get something I think I can rebuild easily the next time VS breaks it. UGH! Unproductive work and not something I expected to do.

When I was last working I was employed by a powerful player in the computer industry who had strong and high-level (CEO to CEO) connections to Microsoft. I had the highest level of support a programmer could ask for. I had the maximum fully paid licenses. My employer required me to work on their physical hardware (with some strange security stuff in it, I didn’t ask, just followed their rules) but I also worked at home. My licenses and contract with my employer permitted me to put licensed copies of Microsoft software on my own computers (I used several to do my job for complicated reasons). I didn’t violate anybody’s rules. When I retired I had to return all my employer’s property and licenses, which I did. But obsolete and out-of-date tools I retained on my computer (not completely clear whether this is OK or not, but no one seemed bothered by it). The IP of both my employer and MSFT was properly handled.

Fine, except I trashed my old computer and now, of course, have none of those licensed installers, up-to-date or obsolete. So I have to use the free software (which is the only choice other than arranging to be a registered developer to MSFT, which is not really what I’m doing, i.e. just dabbling with projects that interest me and will never be sold or used by anyone else). So my current computer is a mere shadow of the robust environment I used to have (I do miss VMWare, a very good way to do trial-and-error sysadmin where a mistake is easily reversed, unlike when you trash the registry on your own real machine).

So it’s tough enough being an individual developer without having additional burdens that really serve no purpose, to Microsoft, and just cause me grief. But beggars can’t be choosers so you do what you have to do. So rather than proceeding with the dozens of new things I want to do with my app I’ll need days of concentrated time to completely restructure it. Oh joy.

So a setback on reporting interesting new results here, but I’ll get there. BTW, without being able to run my app I can’t even work the data side of the street, updating my complex XML vocabulary or doing experiments as a user, not programmer. Well, the good news is that every day I’m doing this, instead of drills, I’m forgetting words, AND so my algorithms, which are designed for periods of use and then disuse and taking it back up again, will get tested.

So there you have it.

New word: anhedonia (and also dysthymia)

15 Sunday Mar 2015

Posted by dmill96 in wod

Tags

vocabulary, WordOfTheDay

These are probably not words you think you need in your vocabulary but they may be more useful than you think. Almost everyone tosses around the word ‘depressed’ but rarely with its clinical meaning; in fact, despondent is probably more appropriate. And lots of people toss around the words hedonism or hedonist, probably not quite accurately as well. Then, of course, there is pessimistic and a favorite I found for my vocabulary, weltschmerz (know that one? you’ll have to look it up). And we can toss around the somewhat old-fashioned word, melancholia. Or even dolorous can sneak its way in there. But these are not very “technical” words, more general descriptions of mood and state of mind, so what is anhedonia:

n. [psychiatry] inability to feel pleasure

n. [psychology] lack of pleasure or of the capacity to experience it

Dictionaries aren’t so great for this kind of thing so I also include the somewhat more useful Wikipedia entry on anhedonia. The dictionary definition isn’t limited to any one type of pleasure (gustatory, sexual, social, consumption, success), but Wikipedia puts anhedonia more in the social realm (“characterized by social withdrawal and decreased pleasure in social situations”), although the general description is “inability to experience pleasure from activities usually found enjoyable, e.g. exercise, hobbies, music, sexual activities or social interactions”.

Not fun, eh?

But here’s the quiz part of this post. What about dysthymia? I’m not even going to include the dictionary definition, ah what the hell, let’s see:

persistent mild depression

Actually that’s fairly useless too, so the Wikipedia entry is more interesting.

Fairly hard to distinguish (hey, read any amount of psychology and psychiatry stuff and your head will be spinning, what a field for imprecision in their definitions of disorder – you think, tell me (accurately) what is neurotic or psychotic? No chance to pin those down).

Now you might wonder why I’m bothering with this. Two reasons, actually. First, I do think both these words can refer, at least on the surface, to a fairly large number of people (at least those a dysthymic wouldn’t call manic, which is what most other people seem to be, i.e. how could they possibly be happy about anything! Life sucks!). Second, I was diagnosed with one of these (guess which one? well, shrinks didn’t use anhedonic much a decade ago but dysthymic was popular). And what do shrinks do today (not much Freudian couch stuff) – hand out pills, and anti-depressants are their cure-alls, despite the fact it’s still somewhat unclear whether they actually ever beat placebos (many psychopharmaceuticals have a big issue with the placebo effect, more so than other medical areas). Net-net: it didn’t work for me.

You know what did – getting on a plane and heading back to the Bay Area. In the old days you used to exit the plane, at San Jose airport, directly onto the runway and then walk inside (remember old Terminal C?). Jet fuel and oil fumes and all, just smelling the air made me want to kiss the runway, like you see the soldiers do returning home. You know what else – climbing in the car with camping gear; with every mile closer I get to the Bighorn Mountains in Wyoming, the spirits lift.

So maybe it’s not the inability to feel pleasure, maybe, tada, it’s just that there is no pleasure in certain situations. Food and booze taste better in California and Wyoming, but it’s the same stuff, so it has to be what those environments represent. Again, maybe anhedonics are actually good at feeling pleasure, they’re just particular about what constitutes pleasure.

Anyway, have fun with the words.