I started this morning hoping to sort out the issue whereby spans were not being split up correctly. First, I realised that this website isn't backed up at all, so I set up a script to sort that out. Then I got back to work. The issue is that matches are currently split into contiguous spans based only on whether the input words they match are contiguous. But what if the input words are contiguous, but the matched words in the example are not? We need to split this into multiple spans. Further, what if the words are contiguous in both the input and the example, but the ordering is changed?
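The splitting rule described above can be sketched roughly as follows. This is a minimal illustration rather than the actual system code: it assumes each match is an (input position, example position) pair and that every input word matches at most one example word, and the name `split_into_spans` is made up for the example.

```python
def split_into_spans(matches):
    """Split (input_idx, example_idx) pairs into spans.

    A span is extended only while BOTH indices advance by exactly one,
    so a gap or a change of order on either side starts a new span.
    """
    spans = []
    for pair in sorted(matches):
        if spans:
            prev_in, prev_ex = spans[-1][-1]
            if pair[0] == prev_in + 1 and pair[1] == prev_ex + 1:
                spans[-1].append(pair)
                continue
        spans.append([pair])
    return spans


# Input words 0-3 are contiguous, but the example side jumps from 6 to 9
# and then steps backwards to 8, so three spans come out.
matches = [(0, 5), (1, 6), (2, 9), (3, 8)]
spans = split_into_spans(matches)
```

Checking only the input side would have produced a single span here; checking both sides handles the non-contiguous case and the reordered case with the same test.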
james's blog
Dependency Matching
There was a problem with the dependency matching algorithm which resulted in some bad matches ranking more highly than they should. The algorithm allowed a single dependency in the input to match several dependencies in an example, provided they all had the same lemmas and relationships.
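One way to rule out the one-to-many case is to let each example dependency be consumed at most once. The sketch below assumes dependencies are represented as (head lemma, relation, dependent lemma) tuples; that representation and the function name are hypothetical, not taken from the real code.

```python
from collections import Counter


def match_dependencies(input_deps, example_deps):
    """Pair each input dependency with at most one example dependency.

    Dependencies are (head_lemma, relation, dependent_lemma) tuples.
    Each example dependency can be used up only once, so a single input
    dependency can no longer be credited with several matches.
    """
    available = Counter(example_deps)
    matched = []
    for dep in input_deps:
        if available[dep] > 0:
            available[dep] -= 1
            matched.append(dep)
    return matched


# One input dependency, three identical dependencies in the example:
# only one match is counted, not three.
input_deps = [("eat", "dobj", "apple")]
example_deps = [("eat", "dobj", "apple")] * 3
matched = match_dependencies(input_deps, example_deps)
```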
Switch back to Moses alignments
Having abandoned the idea of using the GIZA++ alignment scores, I can now switch back to the more reliable Moses alignments (a smart combination of both GIZA++ alignment directions). Having done so, the BLEU score jumped about 3 points:
r157-20080731053726: BLEU = 20.05, 52.9/27.1/16.8/11.0 (BP=0.885, ratio=0.891, hyp_len=4809, ref_len=5398)
r169-20080805193448: BLEU = 23.21, 52.3/28.2/18.3/12.4 (BP=0.966, ratio=0.967, hyp_len=5219, ref_len=5398)
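For reference, the numbers on each line fit together in the standard BLEU way: the score is the brevity penalty times the geometric mean of the four n-gram precisions. The sketch below recomputes both scores from the figures printed above. Because the precisions are rounded to one decimal place in the output, the result only agrees with the reported score to within about 0.1.

```python
import math


def bleu(precisions, hyp_len, ref_len):
    """Return (BLEU, brevity penalty) from n-gram precisions given as fractions."""
    # The brevity penalty only applies when the hypothesis is shorter
    # than the reference.
    bp = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / len(precisions))
    return 100 * bp * geo_mean, bp


score_r157, bp_r157 = bleu([0.529, 0.271, 0.168, 0.110], 4809, 5398)
score_r169, bp_r169 = bleu([0.523, 0.282, 0.183, 0.124], 5219, 5398)
```

Most of the gain comes from the brevity penalty (0.885 to 0.966): the higher n-gram precisions improve a little, but the main change is that the output is no longer so much shorter than the reference.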
Repeated words in output
I thought I'd fixed the problem of target words appearing more than once in the output, but I hadn't done enough. While I had prevented a word from appearing twice in one phrase, it could still get into the output twice by being aligned to words in two separate but linked spans. Fixing this required a bit of restructuring, as generation of target phrases needed to be more aware of what was going on elsewhere.
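The idea behind the fix can be sketched as a set of used target positions that is shared across all spans rather than reset for each phrase. The data layout here is an assumption made for the illustration: spans are lists of source word positions, and `alignments` maps a source position to the target positions it is aligned to.

```python
def generate_phrases(spans, alignments):
    """Build one target phrase per span without reusing any target word.

    spans:      list of lists of source word positions
    alignments: dict mapping a source position to a list of target positions
    """
    used = set()  # shared across ALL spans, not reset per phrase
    phrases = []
    for span in spans:
        phrase = []
        for src in span:
            for tgt in alignments.get(src, []):
                if tgt not in used:
                    used.add(tgt)
                    phrase.append(tgt)
        phrases.append(sorted(phrase))
    return phrases


# Target word 2 is aligned to source words in two different spans;
# it is emitted only in the first phrase that claims it.
spans = [[0, 1], [2]]
alignments = {0: [0], 1: [1, 2], 2: [2, 3]}
phrases = generate_phrases(spans, alignments)
```

With a per-phrase set, the second phrase would have contained target word 2 again.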
Script updates
Updated do_everything.sh:
- includes support for running EBMT only or SMT only
- can use an existing results dir as a base when running SMT only
- cleaned up params file
Updated make_alignment_images.pl:
- includes support for Moses alignment files
- includes basic range support (-from and -to)
Debug Threads
I modified the EBMT system so that when in debug mode it doesn't hang around waiting for diagrams to be created. Before the change, CPU utilisation was down around 40% for the java process, because it would wait for deps2dot.pl and dot to run before resuming. I've changed it so that each diagram is created in its own thread, which is spawned and then forgotten about. Now the java process is consistently running at 100%, with about six or seven other threads creating images simultaneously, using up roughly another 100% CPU.
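The spawn-and-forget pattern looks roughly like this. The real system is Java. This Python sketch only shows the shape of the idea, and `render_diagram` is a stand-in for the slow calls to deps2dot.pl and dot rather than real code from the system.

```python
import threading
import time


def render_diagram(sentence_id, finished):
    """Stand-in for the slow diagram step (external tools in the real system)."""
    time.sleep(0.01)
    finished.append(sentence_id)


def spawn_render(sentence_id, finished):
    """Start a diagram job in its own thread and return immediately."""
    worker = threading.Thread(target=render_diagram, args=(sentence_id, finished))
    worker.start()
    return worker


finished = []
# The main loop carries on straight away instead of waiting for each diagram.
workers = [spawn_render(i, finished) for i in range(5)]
```

The threads are non-daemon, so the interpreter waits for outstanding diagrams at exit even though the main loop never joins them. The sketch puts no limit on how many threads run at once, which is fine for a debug mode but would want a pool if the number of diagrams got large.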
Per sentence BLEU scores
New script, sentence-bleu-scores.pl, generates an HTML page which displays per-sentence BLEU scores for several systems, and their difference from the base system score (i.e. Moses). Can be sorted on any column using JavaScript. Should help identify problem sentences.
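The table behind such a page amounts to one row per sentence, holding each system's score and its difference from the base system. The sketch below is not the Perl script itself: the function name, the system names and the scores are all invented for the illustration.

```python
def score_rows(scores, base="moses"):
    """Build one row per sentence: each system's score plus its diff from base.

    scores: dict mapping a system name to a list of per-sentence scores.
    """
    rows = []
    for i, base_score in enumerate(scores[base]):
        row = {"sentence": i, base: base_score}
        for system, values in scores.items():
            if system != base:
                row[system] = values[i]
                row[system + "_diff"] = round(values[i] - base_score, 2)
        rows.append(row)
    return rows


# Made-up scores for two sentences, purely to show the shape of the output.
scores = {"moses": [30.0, 12.5], "ebmt": [25.0, 20.0]}
rows = score_rows(scores)
# Sorting on the diff column puts the sentences that got worse at the top.
worst_first = sorted(rows, key=lambda r: r["ebmt_diff"])
```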
Removed GIZA scores
Tried running the system with the GIZA++ alignment 'probabilities' removed. BLEU score barely changed. Changing XML mode from exclusive to inclusive gave a big performance jump though. Looks like I need to play around with the probability parameter some more.
Do Everything
Finally created a script that pulls the whole compilation-ebmt-smt-bleu process together into one easy package. do_everything.sh grabs the latest revision from subversion, compiles it, runs the EBMT stage, then feeds it through Moses, and finally scores the output using BLEU. All of the output is stored tidily in a new folder with details of the software revision and all script parameters.
GIZA++ Structure
Just a quick one to keep track of what I know about how GIZA++ works.
Files:
hmm.{h,cc}: defines and creates HMM network
model1.{h,cc}: defines and initialises tTable (translation table?)
TTables.{h,cc}: template used for tTable