Wednesday, April 3, 2013

FactoryGirl: how to fill tables with many-to-many relations

What is FactoryGirl?

If you are new to Ruby testing, this part is for you.

FactoryGirl is a nice replacement for fixtures. Some reasons to use it:

  • FactoryGirl simplifies maintenance (when you add a new field to the database, you only have to add it in one place - the appropriate factory).
  • FactoryGirl can generate different values for each created object (e.g. with sequences).
  • You can think about objects and the relations between them, not about IDs.

The official documentation (with good examples) is here.

Fill tables with many-to-many relations

Table structure

The documentation doesn't mention it, but FactoryGirl allows you to write custom code in before and after blocks. This is very convenient for filling many-to-many relations.

Imagine there are 2 tables - teachers and subjects - with a many-to-many relation between them. The migration that creates these tables:

create_table :teachers do |t|
  t.string :name
end

create_table :subjects do |t|
  t.string :name
end

create_table :appointments do |t|
  t.references :teacher
  t.references :subject
end
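
For reference, the ActiveRecord models that go with these tables would wire the relation up with has_many :through. This is a minimal sketch of the Rails configuration (the class and association names are assumed from the table names above, not taken from the original post):

```ruby
# Assumed models for the teachers/subjects/appointments tables above.
class Teacher < ActiveRecord::Base
  has_many :appointments
  has_many :subjects, through: :appointments
end

class Subject < ActiveRecord::Base
  has_many :appointments
  has_many :teachers, through: :appointments
end

class Appointment < ActiveRecord::Base
  belongs_to :teacher
  belongs_to :subject
end
```

With associations like these in place, an expression such as subject.teachers resolves through the appointments join table.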

Main idea

1. In the factory, declare an ignored field teachers, which does not exist in the Subject class but can be passed from a test:

factory :subject_with_teachers do
  ignore { teachers [] }
end
2. This field will be available as evaluator.teachers in the after action. Access it:
factory :subject_with_teachers do
  ignore { teachers [] }

  after(:create) do |subject, evaluator|
    evaluator.teachers.each do
      ...
    end
  end
end

Note: in the after action we can write custom code, e.g. an each loop.

3. Pass this parameter from an RSpec test (or another test framework):
describe '...' do
  let(:teachers)       { FactoryGirl.create_list(:teacher, 3) }
  let(:subjects)       { FactoryGirl.create_list(:subject_with_teachers, 3, teachers: teachers) }
  ...
end

Full code

Factories

FactoryGirl.define do

  factory :teacher do
    sequence(:name) {|n| "name #{n}"}
  end

  factory :appointment, class: Appointment do
    teacher # should be passed by the caller
    subject # should be passed by the caller
  end

  factory :subject do
    sequence(:name) {|n| "name #{n}"}

    # inherits all subject fields
    factory :subject_with_teachers do

      # creates a 'field' teachers with the default value []
      # the caller can set this parameter when calling FactoryGirl.build or FactoryGirl.create
      ignore { teachers [] }

      after(:create) do |subject, evaluator|
        # subject contains only the fields defined in the factory
        # evaluator contains all fields, including the ignored ones

        # here we can write any code, for example - an each loop
        puts "#{subject} has been created"
        evaluator.teachers.each do |teacher|
          FactoryGirl.create(:appointment, { teacher: teacher, subject: subject })
          puts "Appointment for #{teacher} and #{subject} has been created"
        end
      end

    end
  end
end

RSpec test

describe 'subjects with teachers' do
  let(:teachers)       { FactoryGirl.create_list(:teacher, 3) }
  let(:subjects)       { FactoryGirl.create_list(:subject_with_teachers, 3, teachers: teachers) }

  it 'should create 3 subjects, 3 teachers and 9 appointments between them' do
    subjects.length.should == 3
    subjects[0].teachers.should == teachers
    subjects[1].teachers.should == teachers
    subjects[2].teachers.should == teachers
  end
end

Monday, March 11, 2013

Postgres ORDER BY - how random?

Recently I ran into a surprising bug (a feature?) of PostgreSQL. It looks like under certain circumstances the order of rows in the result set of a SELECT statement can change between runs. In other words, if you run it once and then run it again, the order of records in the two result sets will be different.

To add some specifics: the SELECT statement in question has an ORDER BY clause. Ordering is done by a string field. The sort order has to be case insensitive, so that identical phrases sit next to each other regardless of case. Obviously, if the result set contains several rows whose string field values differ only in case, the order of those rows in the result set is arbitrary, which I totally expected and am completely fine with.

What I did not expect is that this order can change between 2 runs of the query.

It does not happen in a simple SELECT ... ORDER BY though. I ran into this feature/bug while paging with OFFSET/LIMIT. It so happened that my result set had 2 records: one with the sorting key 'Account' and another with the sorting key 'ACCOUNT'. In the result set without OFFSET/LIMIT clauses the 'Account' record was placed at position 130, while the 'ACCOUNT' record was sitting at position 131. This is fine. I do not care which of the two comes first.

But when I started to read this result set in chunks of 10, something weird happened. The 'Account' record, as expected, was the last one in the result set of the statement with OFFSET=120, LIMIT=10.

What was not expected was for it to also show up as the first record of the next chunk, OFFSET=130, LIMIT=10. The rightful owner of position 131 - the 'ACCOUNT' record - was nowhere to be found in the result sets. As a result of this problem one of my records was processed twice, while another one was skipped.

The workaround for this problem was simple - I appended the primary key to the sort field. This made the sorting key unique without altering the desired order of the records in the result set.
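
In SQL terms the fix looks like this (the table and column names are made up for illustration):

```sql
-- Arbitrary tie-breaking: 'Account' and 'ACCOUNT' may swap places
-- between executions, so OFFSET/LIMIT pages can overlap or skip rows.
SELECT id, name
FROM records
ORDER BY lower(name)
OFFSET 120 LIMIT 10;

-- Deterministic: appending the primary key makes the sort key unique,
-- so consecutive pages never repeat or lose a record.
SELECT id, name
FROM records
ORDER BY lower(name), id
OFFSET 120 LIMIT 10;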

Tuesday, March 5, 2013

Beware of JavaScript ghosts

Yesterday I was ambushed by a JavaScript ghost. Or conned? Not sure. Does not matter. I am just confused. Here's what happened. My function (an event handler) is passed an object:
function(result) {... 
Inside the function I can programmatically examine the object just fine. For instance I can retrieve the values of the object properties.
function(result) { 
    var s = result.status;
}
I can also introduce a variable and assign the object to be its value:
function(result) { 
    var s = result.status;
    var r = result;
}
Once it is done, I can access the property value off the object reference I assigned to the variable:
function(result) { 
    var s = result.status;
    var r = result;
    s = r.status;
}
So what exactly is the problem, the ambush, the con? Here... The same expression (r.status) executed outside the function gives a different result. Instead of, let us say, 500 you will get undefined. More than that, if you examine the value of r itself, while inside the function it shows up as the object passed as the argument, once outside it becomes undefined.

The variable loses its value as it is passed out of scope!!!

You do not believe this is possible? I would not either, but do not take my word for it. Check for yourself.

OK, OK... I cheated - just a little. This is not just any object - this is an XHR returned by jQuery for an error. And it is not quite JavaScript either. As you can see in the fiddle, the failing expression is not a JavaScript expression but rather an Angular expression. They just look the same, and I would expect them to work the same way. Confusing...

Thursday, February 28, 2013

Would you like some CoffeeScript?

I am not sure.

I grew up on white space agnostic languages - C, C#, Java, and, yes, JavaScript among others. Languages attaching significance to tabs and spaces took me some getting used to, but I learned to appreciate the clarity and conciseness of languages like F# and Ruby.

Now, CoffeeScript, well... I would really love to use it instead of JavaScript. The syntax seems to be friendlier - more concise, with much less noise (brackets, keywords etc.) necessary to express the same concepts.


# Function:
square = (x) -> x * x
// Function
square = function(x) {
  return x * x;
};
# Object:
kids =
  brother:
    name: "Max"
    age:  11
  sister:
    name: "Ida"
    age:  9
// Object:
kids = {
  brother: {
    name: "Max",
    age: 11
  },
  sister: {
    name: "Ida",
    age: 9
  }
};
# Loop:
eat food for food in ['toast', 'cheese', 'wine']
// Loop:
var food, _i, _len, _ref;

_ref = ['toast', 'cheese', 'wine'];
for (_i = 0, _len = _ref.length; _i < _len; _i++) {
  food = _ref[_i];
  eat(food);
}

The examples above are from the official CoffeeScript web site, and they look great - right? Brackets are gone, commas, semicolons... Throw in variable scoping, classes, and a bunch of smaller perks like Ruby-style string interpolation, and there you have it - a strong case for using CoffeeScript everywhere. This is especially true when coding for Angular or other frameworks with a similar style, where, for instance, nested anonymous functions occur on every other line of code.

Unfortunately this is not the end of the story. What gives me pause is that the compiler (the language) seems to be too finicky. Some expressions compile as expected, while others, seemingly very similar, do not:

# This one compiles:
foo param,->
// as expected:
foo(param, function() {});
# This one does not:
foo ->,param
// ERROR
# This one is good:
variable1 - variable2 -
        variable3 - variable4
// as expected:
variable1 - variable2 - variable3 - variable4;
# This one not so much:
variable1 - variable2
        - variable3 - variable4
// ERROR

Combine this with the less-than-perfect diagnostics, and you can spend more than a few minutes figuring out why the code would not even compile after you copy-pasted an existing function. And here is another twist. Sometimes whitespace in the wrong place produces code which compiles into something completely unexpected. Here is an example:

# This one:
foo(bar) param
// compiles to:
foo(bar)(param);
# with one extra space:
foo (bar) param
// it compiles to:
foo(bar(param));

Of course part of this is just the learning curve and will go away after some time. You can also argue that quirks like the one above should be caught by unit testing. All true, but I am still uneasy... Should I keep using JavaScript or should I switch to Coffee?

So... what is your poison? Today's specials are:

Verbosity of the older syntax with JavaScript

or

Hypersensitivity to spaces and syntax traps with CoffeeScript

Which one do you feel better about?

Me - I am not sure.

Saturday, February 2, 2013

JQuery: the Angular way - Selectmenu


Yesterday I was presented with a challenge: how to use the selectmenu plugin by Felix Nagel in an angularjs application.
The angular way with jQuery plugins is to create an angular directive wrapping the plugin and then apply this directive to the tag the plugin should be applied to. Something to the effect of:


<select select-menu>
   <option value="...">...</option>
</select>

for the HTML and

.directive(

  "selectMenu"
  [ '$log',
    (console) ->
      {
        link: (scope, element, attrs, ctrl) ->
            $(element).selectmenu()
      }
    ])


for the directive definition in CoffeeScript. For plugins relying only on the HTML tag they are attached to, this approach works just fine. It will work fine for the selectmenu plugin as well, as long as the html, including the nested option tags, is static.

But as soon as you try to make the list of options dynamic, like:

<select select-menu>
   <option ng-repeat="option in options" value="...">...</option>
</select>

it no longer works; the list of options shows up empty. The reason is that the selectmenu method is called too early. The ng-repeat is yet to do its magic, and the real list of option elements does not exist yet.

The solution to this problem is simple: invite the option tag to the party. Create another directive and apply it to every option. When this directive is linked, it should modify the parent select to reflect that a new option has just been added:

.directive(
  "selectOption"
  [ '$log'
    (console) ->
      {
        link: (scope, element, attrs, ctrl) ->
          $(scope.select).selectmenu() 
      }
    ])

The way it works is that right after the new option tag is added to the DOM, the link method is called, and the selectmenu plugin documentation recommends calling the selectmenu method again as a way to update the list of options.

This would be the end of the story, but there was one more thing. It turns out that it is still too early: the option tag has been added, but the value of its text is not evaluated yet. I had to do it myself using the $interpolate function. That was the last touch. You can see the result here.

Thursday, January 31, 2013

How to: get nice git history

Problem

There are 3 developers. Each created 1 commit locally.

  • developer 1:
    • git push origin master - OK
  • developer 2:
    • git push origin master - FAIL
    • git fetch origin
    • git merge origin/master - creates a fork in the history plus a merge commit *
    • git push
  • developer 3:
    • git push origin master - FAIL
    • git fetch origin
    • git merge origin/master - creates a fork in the history plus a merge commit *
    • git push
*Note: git pull origin master is roughly equivalent to git fetch origin && git merge origin/master

So, the result in the git history:

For this reason, the history in our projects often looks like this:



How to fix it?

Everyone should use git up instead of git pull.

To add this useful alias execute:

For Linux/Mac OS
git config --global alias.up '!(git add . && git stash && git pull --rebase >&2) | grep -v "No local changes to save" && git stash pop'

For Windows (updated)
git config --global alias.up 'git pull --rebase'
This command is not as powerful, but it works on Windows.


The alias uses git pull --rebase instead of a plain git pull.
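
To see the difference concretely, here is a disposable sandbox (temporary directories, throwaway repositories; requires git 2.28+ for init -b) reproducing the developer-2 scenario with --rebase:

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

# a bare "central" repository and two developer clones
git init -q --bare -b master origin.git
git clone -q origin.git dev1
git clone -q origin.git dev2
for repo in dev1 dev2; do
  git -C "$repo" config user.email "$repo@example.com"
  git -C "$repo" config user.name  "$repo"
done

# shared starting point, pushed by developer 1
cd dev1
echo base > base.txt && git add . && git commit -qm "base"
git push -q origin master

# developer 2 starts from the base and commits locally...
cd ../dev2
git pull -q origin master
echo two > two.txt && git add . && git commit -qm "dev2 work"

# ...while developer 1 pushes another commit first
cd ../dev1
echo one > one.txt && git add . && git commit -qm "dev1 work"
git push -q origin master

# developer 2 rebases instead of merging: no merge commit is created
cd ../dev2
git pull --rebase -q origin master
commits=$(git rev-list --count HEAD)
merges=$(git rev-list --merges --count HEAD)
echo "$commits commits, $merges merges"
```

The history ends up as 3 plain commits and no merge commit; with a plain git pull at the last step, a merge commit would have appeared instead (or a recent git would have stopped and asked how to reconcile the divergent branches).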

More details about the difference between git pull and git pull --rebase are at this link: http://habrahabr.ru/post/161009/ (the main part of the article is reproduced in translation below - sometimes sites die).


Git Rebase: a usage guide


Rebase is one of the two ways to combine changes made in one branch with another branch. Beginner and even experienced git users are sometimes reluctant to use it, seeing no point in mastering yet another way to combine changes when they are already perfectly comfortable with merge. In this article I would like to go through the theory and practice of using rebase in detail.

Theory


So, let's refresh the theory of what rebase actually is. To start with, in short: you have two branches - master and feature, both local; feature was created from master at state A and contains commits C, D and E. After feature branched off, 1 commit B was made to master.



After running rebase master in the feature branch, the commit tree will look like this:



Note that commits C', D' and E' are not equal to C, D and E - they have different hashes, but the changes (deltas) they carry are, ideally, exactly the same. The difference in the commits comes from their having a different base (A in the first case, B in the second); differences in the deltas, if any, come from resolving the conflicts that arose during the rebase. More on that a bit later.

This state has one important advantage over the first: when the feature branch is merged into master, it can be merged by fast-forward, which rules out conflicts during that operation; moreover, the code in feature is more up to date, since it incorporates the changes made to master in commit B.

The rebase process in detail


Now let's look at the mechanics of this process: how exactly did tree 1 turn into tree 2?

Recall that before the rebase you are on the feature branch, i.e. your HEAD points at the feature pointer, which in turn points at commit E. You pass the master branch identifier to the command as an argument:

git rebase master

First git finds the base commit - the common ancestor of the two states. In this case it is commit A. Then, moving towards your current HEAD, git computes the difference for each pair of commits; on the first step, between A and C - call it ΔAC. This delta is applied to the current state of the master branch. If no conflict arises, commit C' is created, so C' = B + ΔAC. The master and feature branches do not move, but HEAD moves to the new commit (C'), putting your repository into the 'detached HEAD' state.


Having successfully created commit C', git moves on to transferring the next set of changes - ΔCD. Suppose a conflict arises while applying these changes to commit C'. The rebase process stops (it is at this moment that typing git status will show you that you are in the detached HEAD state). The changes introduced by ΔCD are in your working copy and (except for the conflicting ones) staged for commit (the dashed outline denotes the stage area):


From here you can take the following steps:

1. Abort the rebase process by typing

git rebase --abort

The HEAD marker will be moved back to the feature branch, and the already created commits will be left dangling (no pointer will reference them), to be deleted before long.

2. Resolve the conflict in your favorite merge tool and stage the files by typing git add %filename%. Having done this for all the conflicting files, continue the rebase by typing

git rebase --continue

If all the conflicts are indeed resolved, commit D' will be created and the rebase will move on to the next - in this example, the last - step.

3. If the changes made in commit B and in commit D are completely mutually exclusive, with the 'correct' changes being the ones in commit B, you will not be able to continue with git rebase --continue: after resolving the conflicts you will find there are no changes in the working copy. In this case you have to skip the creation of commit D' by typing

git rebase --skip

After the ΔDE changes are applied, the last commit E' will be created, the feature branch pointer will be set to commit E', and HEAD will point at the feature branch - you are now in the state shown in the second picture; the rebase is done. The old commits C, D and E are no longer needed.


Note that commits created during a rebase contain data both about the original author and date of the changes (Author) and about the user who performed the rebase (Committer):

commit 0244215614ce6886c9e7d75755601f94b8e19729
Author:     sloot69 <***@****.com>
AuthorDate: Mon Nov 26 13:19:08 2012 +0400
Commit:     Alex <***@****.com>
CommitDate: Mon Nov 26 13:33:27 2012 +0400


Back down to earth - rebase in real life


In reality you usually work not with two branches but, in the simplest case, with four: master, origin/master, feature and origin/feature. Rebase is possible both between a branch and its origin counterpart, e.g. feature and origin/feature, and between the local branches feature and master.

Rebasing a branch against its origin


If you want to start working with rebase, the best place to start is rebasing your changes in a branch against its copy in the remote repository. The thing is, when you add a commit and a commit is also added to the remote repository, merge is used by default to combine the changes. It looks roughly like this:



Imagine a hypothetical situation: 3 developers are actively working with the master branch in a remote repository. Committing simultaneously on their own machines, they each send 1 change to the branch. The first one pushes without problems. The second and third find that the branch cannot be pushed with git push origin master, since it already contains changes that are not synchronized to the developers' local machines. Both developers (2 and 3) run git pull origin master, creating local merge commits in their repositories. The second one does git push first. The third one, trying to push his changes, again runs into an updated remote branch and again does git pull, creating one more merge commit. Finally, the third developer does a successful git push origin master. If the remote repository is hosted, say, on github, then the network - that is, the commit graph - will look like this:

Three commits have turned into 6 (not counting the base commit), the change history is needlessly tangled, and the information about merging local branches with remote ones is, in my view, superfluous. If you scale this situation up to several topic branches and a larger number of developers, the graph can look, for example, like this:



Analyzing the changes in such a graph is needlessly laborious work. How can rebase help here?
If instead of git pull origin master you run git pull --rebase origin master, your local changes, like commits C, D and E from the first example, will be moved on top of the updated state of the origin/master branch, letting you send them to the remote server with git push origin master without creating extra commits. That is, the merging of changes will now look like this:



As you can see, no 'extra' merge commits were created; your changes from commit C were used to create commit C'. Commit C itself remained in your local repository; only C' was sent to the remote repository. If all the programmers on a team make it a rule to use git pull --rebase, every branch in the remote repository will look linear.

Wednesday, January 23, 2013

Embedding Images into HTML Pages


Just another trick used for the offline page mentioned in this post.  The page should display a number of images (like jQuery UI theme icons) and they have to be fully available after the page is downloaded.  If you save the page from the browser menu, it creates an additional folder named MyCoolHtmlPage_files (assuming the page is named MyCoolHtmlPage.html) and places all the resources downloadable from the same server there.  JavaScript, CSS files, images - all of them go into that folder.  But I think for a downloadable page it is not a good idea to give our users the page itself and all the needed support files separately.  So we need to embed all the resources into the page.  Embedding JavaScript and CSS files is no problem as they are just plain text.  Images are less friendly.

Fortunately all modern browsers support the data URI scheme, which allows us to embed the actual image content into the CSS resource URL.  For one of the embedded jQuery UI backgrounds this looks like:

.ui-widget-header {
    background-image: url('data:image/png;base64,iVBORw0KGgoAAAANSU
hEUgAAAAEAAABkCAYAAABHLFpgAAAALElEQVQYlWN49OjRfyYGBgaGIUT8//8fSqBx0
Yh///4RL8vAwAAVQ2MNOwIAl6g6KkOJwk8AAAAASUVORK5CYII=');
}

Yeah, some effort is needed to convert all the used images into Base-64 encoded strings and write appropriate styles overriding the standard ones, but I think it is minor (I just used an online encoder).
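
The encoding itself doesn't strictly require an online tool; with GNU coreutils a one-liner does it (icon.png here is a stand-in file, not a real image):

```shell
# stand-in payload; in practice this would be a real image file
printf 'PNG' > icon.png

# -w0 disables line wrapping so the data URI stays on one line (GNU base64)
css_url="url('data:image/png;base64,$(base64 -w0 icon.png)')"
echo "$css_url"
```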

Note: not all browsers support newlines in url(), so it is a good idea to place the embedded data on a single line.  I have added newlines to my example just for readability (if the word "readability" is applicable to Base-64 encoded binary data).

X-Post from: http://dzorin68.blogspot.com/2013/01/embedding-images-into-html-page.html

Looking for JavaScript Decompressor for Data Compressed by C#


It happened that I needed to make an HTML page.  Sometimes such things happen, you know.  That page was required to be completely offline - just a snapshot of some piece of data, downloaded by users for further analysis.  At the same time the offline page was required to be as interactive as the online page with the same data.

The online page for that data contained a grid with "master" rows, and upon clicking on a row it presented a dialog box with "details" loaded from the server by an AJAX request.  The offline page could not use AJAX, so it had to contain all the needed data - both the "master" part and all the "details" - inside it.  The average amount of data used on these pages was not very small - about 3 megabytes in JSON format - so it was decided to compress that data on the server side and decompress the needed pieces on the fly when they come into the dialog.

The code for the server side was not very complex.  As the web application uses ASP.NET MVC and is written in C#, I could use the standard .NET classes for compression - either System.IO.Compression.DeflateStream or System.IO.Compression.GZipStream:


var details = new Dictionary<string, string>();
foreach (DataRow row in detailsData.Rows)
{
    var text = Convert.ToString(row["details_content"]);
    var bytes = Encoding.UTF8.GetBytes(text);
    using (var ms = new MemoryStream())
    {
        using (var zipStream = new DeflateStream(ms, CompressionMode.Compress))
        {
            zipStream.Write(bytes, 0, bytes.Length);
            zipStream.Flush();
        }
        details[Convert.ToString(row["master_id"])] = Convert.ToBase64String(ms.ToArray());
    }
}
viewModel.ZippedResultsJson = EncoderHelper.SerializeToJsonString(details);

But what to do on the page?  Are there any JavaScript implementations of, say, inflaters?  Initially I thought that these days there are lots of them, as probably almost all things are already implemented in JS.  But when I tried to look for them, I found that things are not that good.  Actually there are just a few implementations of the deflate/inflate algorithms, and not all of them look like they work.  Thanks, StackOverflow - they know the answers to almost all our questions, and I found lots of answers on how to get some of the JS decompressors to work with Java or Python server-side compression.

But only a couple of questions about the C#-to-JS pair were answered, and those answers amounted to "just use the browser gzip support", i.e. encode the whole response.  This is not my case as I don't have a response, so I tried some of the implementations to see which of them is more usable.  Things were worse than I expected: none of the relatively small inflate implementations worked with C#'s DeflateStream-compressed data (although they worked quite fine with data compressed by themselves).

So I decided to try gzip instead of "raw" deflate.  And I found the only JS implementation of a gzip decompressor at http://jsxgraph.uni-bayreuth.de/wp/2009/09/29/jsxcompressor-zlib-compressed-javascript-code/.  I tried it and it worked!  (Of course, I changed the DeflateStream to GZipStream on the server side.)  Lots of thanks to the guys who created that really working tool!  With it, all I needed on the page was just:

$('#details-div').html(JXG.decompress(detailsData[masterId]));

And that's it!  Simple, usable, great!  And it does Base-64 decoding inside, so there is no need for separate decoding tools or libraries.  Great job, guys!  Now I know the right tool for this task.

X-Post from: http://dzorin68.blogspot.com/2013/01/looking-for-javascript-decompressor-for.html