Discussion:
[Revctrl] updates to my notes on revision control
zooko
2008-02-25 17:04:44 UTC
Permalink
Folks:

My brother asked me for advice on choosing a revision control tool,
so that prompted me to update this old quick ref for the first time
in almost three years:

https://zooko.com/revision_control_quick_ref.html

I also updated my "badmerge" notes:

https://zooko.com/badmerge/simple.html

Along the way I realized that the argument that Linus Torvalds
(citation sadly lost) and others have advanced to defend merge
algorithms which fail to track patch location also defends merge
algorithms which arbitrarily change characters in patches.

Regards,

Zooko

P.S. If anyone can help me find the LKML thread from maybe two years
ago in which git was silently deleting (or perhaps resurrecting) a
file when an apparently-unrelated changeset was merged, I would
appreciate it. It was a post in this thread, by Linus Torvalds, that
I wanted to reference. He stated that, while they would certainly
tweak git's heuristics to avoid this particular case, people
shouldn't take this too far by hypothesizing that it was possible
to do better in general.
Michael Richters
2008-02-25 21:18:44 UTC
Permalink
I just had a look at your "badmerge"
(https://zooko.com/badmerge/simple.html) scenarios for the first time,
and I want to point out that in your argument, you are making an
assumption which is not necessarily correct. Though I do believe that
it is good for the merge algorithm to do its best to track the
identity of each unit of content, doing so perfectly is not
theoretically possible.
Using a simplified notation for your example, if I make the following
changes:

A -> BA -> ABA

...the version control system must know what my intent was in order to
merge the following:

A -> X

If, in the second change on my branch (BA -> ABA), I simply added an
"A" at the beginning, the darcs merge algorithm will be correct in
producing "ABX", but if I first moved "A" to the beginning, then added
another "A" at the end, the darcs merge algorithm will be incorrect.
Also, if I first deleted the "A", then added two new ones which only
coincidentally were the same as the first, it's not clear what the
"correct" merge would be (maybe it should be "ABA" or "XBX"). I would
argue that its guess is better than the others, but it is still a
guess, unless it can read my mind.
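To make this concrete, here's a small sketch in Python (mine, purely
illustrative -- not any tool's actual code): a location-aware patch
that rewrites "A" to "X" at a given index yields different merges
depending on which "A" in "ABA" it is taken to target, and nothing
in the final content can settle that question.

```python
def apply_replace(s, at, old, new):
    """Apply a location-aware patch: replace `old` with `new` at index `at`."""
    assert s[at:at + len(old)] == old, "patch does not apply here"
    return s[:at] + new + s[at + len(old):]

# Branch 2 changed the original "A" to "X".  Merging that into branch
# 1's "ABA" depends entirely on where the original "A" now sits:
print(apply_replace("ABA", 0, "A", "X"))  # the "A" moved to the front -> XBA
print(apply_replace("ABA", 2, "A", "X"))  # the "A" stayed at the end  -> ABX
```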

If the version control system was integrated with my text editor, and
knew more of the history, it could make an even better guess, but even
then, when I commit something to a version control system, all I see
is the current state of the content, not the whole path of how it got
there (nor would I want to). Only if the version control system
stops me at commit time (or, less "perfectly", merge time) to ask
which "A" (or neither or both) corresponds to the original "A" can
we get the correct results. Of course, this would remove most of
the automation that's the whole purpose of the merge algorithm.

I submit that there is no merge algorithm that cannot be defeated by
both clever and stupid users, even if the version control system
forces the users to declare the intent of their changes in a draconian
(and cumbersome) fashion.


--Mike
Walter Franzini
2008-02-25 21:47:12 UTC
Permalink
Post by zooko
My brother asked me for advice on choosing a revision control tool,
so that prompted me to update this old quick ref for the first time
https://zooko.com/revision_control_quick_ref.html
Hi,

I think you should toggle the decentralized flag for Aegis.

It's not a widely known/used feature but Aegis has it.
More info available at

http://mysite.verizon.net/ralph.a.smith1/aegis/refman-html/aedist-1.html

http://mysite.verizon.net/ralph.a.smith1/aegis/user-guide-html/ug-c10_0-geographically_distributed_development.html#id2608103

and

http://aegis.stepbuild.org/aegis-talk.pdf

ciao
--
Walter Franzini
http://aegis.stepbuild.org/

PGP Public key ID: 1024D/CB3FEB43
Key fingerprint : FA26 C33B CAFF 7848 EFEB 7327 96AA 2D57 CB3F EB43
Key server : http://www.keyserver.net
Thomas Lord
2008-02-26 03:51:32 UTC
Permalink
Post by zooko
https://zooko.com/badmerge/simple.html
The note ("badmerge") seems to lead off with a conceptual error, to me.

You say that the Darcs algorithm is not making simply
a "better guess" but is using knowledge of actual code
motion.

That's not strictly true. Darcs is making use of
heuristically estimated code motion. So Darcs merge
solution is, actually, still a guess -- just a different kind
of guess.

There are pathological cases for all merge operations.
That's because we don't have (and probably can't have)
a really perfect language for describing a "change" to
a program as it relates to the pursuit of the goals of writing
a program. The activity of merging is always in pursuit
of the goals of writing the program in question but the
activity of merging tangibly deals with "changes" as well as we can manage
to describe them. The two sets of meanings ("goals in writing" vs.
"changes-as-in-merge-slinging") don't enjoy any obvious
isomorphism and for deep reasons probably won't ever. So,
merge tools are always, pretty much by definition, tools designed
to "make guesses".

One measure of quality in a revision control system
is how easy it is to add new merge tools to the toolbox.
Is meta-data about history maintained in such a way that new
tools are easy to add and can find the history info they need?
Arch is unsurpassed in this area, I think.

-t
Marnix Klooster
2008-02-26 05:22:32 UTC
Permalink
Post by Thomas Lord
The note ("badmerge") seems to lead off with a conceptual error, to me.
You say that the Darcs algorithm is not making simply
a "better guess" but is using knowledge of actual code
motion.
That's not strictly true. Darcs is making use of
heuristically estimated code motion. So Darcs merge
solution is, actually, still a guess -- just a different kind
of guess.
There are pathological cases for all merge operations.
That's because we don't have (and probably can't have)
a really perfect language for describing a "change" to
a program as it relates to the pursuit of the goals of writing
a program.
Hi Thomas,

Could you clarify this for me? I know darcs a bit, and I'm not sure
where darcs is guessing when merging. Darcs *is* guessing when creating
a patch -- just like 'diff' it sometimes finds differences like

}

void g(void)
{
printf("g!\n");

if e.g. a new C function is added.

However, the result of this is a darcs patch that says "add these 5
lines [see above] in between the lines that are currently 12th and
13th in the file that is currently called src/fgh.c".

When that patch is merged into a repository that already has an
additional patch which inserts 3 lines at the top of fgh.c, then the
first patch is changed to "add these 5 lines [see above] in between the
lines that are currently 15th and 16th in the file that is currently
called src/fgh.c". And that 'commuted' patch applies cleanly to that
new repository, without looking at the actual file contents at all.

No guessing involved here.
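In toy Python (a model of insert-only hunks I just made up, not
darcs's actual implementation), that commutation step looks like
this:

```python
class Insert:
    """A darcs-style hunk: insert `lines` before 0-based line `at`."""
    def __init__(self, at, lines):
        self.at, self.lines = at, list(lines)

    def apply(self, text):
        return text[:self.at] + self.lines + text[self.at:]

def commute_past(patch, earlier):
    """Rewrite `patch` so it applies to a tree that already has `earlier`."""
    if earlier.at <= patch.at:
        return Insert(patch.at + len(earlier.lines), patch.lines)
    return Insert(patch.at, patch.lines)

base = ["line %d" % i for i in range(20)]
new_func = ["}", "", "void g(void)", "{", '    printf("g!\\n");']
patch = Insert(12, new_func)               # between the 12th and 13th lines
header = Insert(0, ["/*", " new ", "*/"])  # 3 lines added at the top

# Commuting shifts the hunk to "between the 15th and 16th lines",
# without ever looking at the file contents:
shifted = commute_past(patch, header)
print(shifted.at)  # -> 15
```

Applying `header` then `shifted` gives the same tree as applying
`patch` then `header`, which is the whole point of commutation.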

Or did you mean something else?

Groetjes,
<><
Marnix
William Uther
2008-02-26 06:03:57 UTC
Permalink
Post by Thomas Lord
The note ("badmerge") seems to lead off with a conceptual error, to me.
You say that the Darcs algorithm is not making simply
a "better guess" but is using knowledge of actual code
motion.
That's not strictly true. Darcs is making use of
heuristically estimated code motion. So Darcs merge
solution is, actually, still a guess -- just a different kind
of guess.
Could you clarify this for me? I know darcs a bit, and I'm not sure
where darcs is guessing when merging. Darcs *is* guessing when creating
a patch--
Hi,
Tom didn't say DARCS was guessing when merging. He said that the
DARCS merge *solution* is still based on guessing. This is true for
exactly the reason you identified -- the initial patch is a guess.
The merge algorithm suffers from "Garbage In, Garbage Out", or
rather "Guess In, Guess Out".

If DARCS had a way for the user to check that the patch was 'correct'
before it was committed, then there would be no guessing in the
solution. In that case the user could catch the example you gave of
a bad diff.

I think it is also true that having clean, formal semantics does not
always mean a system does what you want. Even if DARCS provided a UI
for patch verification, and hence wasn't guessing, it could still get
things wrong at a semantic level because the patch language it uses
cannot express all the code change concepts we use with 100% fidelity.

Cheers,

Will :-}
zooko
2008-02-26 15:05:36 UTC
Permalink
Folks:

Right -- I generally agree with what Michael Richters, Walter
Franzini, Tom Lord, Marnix Klooster, and William Uther have posted in
this thread.

I like to break down the parts of the revision control algorithm like
this:

There is "resolution" (terminology chosen by the Codeville guys),
which is the process of looking at two snapshots of the filesystem
and inferring what sort of change the user meant. This is the part
where the hunks can be "not what the user intended" while still being
syntactically right, as Marnix showed:

}

void g(void)
{
printf("g!\n");

(As an example of how the addition of a new C function named 'g'
might be resolved.)
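Both resolutions of that example are syntactically valid. A sketch
(purely illustrative Python, not any tool's code) shows that the
intended hunk and the brace-misaligned hunk that Marnix quoted
produce the identical file, so the resolver cannot distinguish them
from content alone:

```python
base = ["void f(void)", "{", '    printf("f!\\n");', "}"]

# Resolution 1 (what the user meant): append the new function whole.
intended = base + ["", "void g(void)", "{", '    printf("g!\\n");', "}"]

# Resolution 2 (the hunk Marnix quoted): align g's closing brace with
# f's, and insert the 5 lines before the file's last line instead.
accidental = (base[:3]
              + ["}", "", "void g(void)", "{", '    printf("g!\\n");']
              + base[3:])

# Both edit scripts are "correct" -- they yield the same file:
assert intended == accidental
```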

Then there is "location", which is the process of determining, when
applying a given patch, where to apply it. There are at least two
kinds of location question -- "Which file does this patch apply
to?" (Tom Lord's original "which Makefile?" problem) and "Which lines
does this hunk apply to?". Some tools, such as arch, can track
"which file" precisely, but use a 3-merge for "which lines", which
means they cannot answer the "which lines" question without
guessing. Others, such as darcs, track "which lines" precisely.


What I am currently interested in is separations between these
concepts. Here are a few assertions:

1. Correctly answering the location question is not sufficient for
applying a patch "in the way that the user intended" because of the
possibility of the resolution being unintended by the programmer.
Tom Lord and William Uther pointed this out on this thread.

2. Correctly answering both the location question and the resolution
question would still not be sufficient for "merge produces a better C
program".

3. Incorrectly answering some of these questions can sometimes
accidentally lead to better C programs, as per the example at [1].
Therefore, correctly answering the resolution question is not
necessary for "merge produces a better C program", and correctly
answering the location question is not necessary for "merge produces
a better C program".

3.b. Arbitrarily swapping characters for other characters can also
sometimes accidentally lead to better C programs, as per my argument
in [1].


Obviously I, and everyone else, know that "merge produces a better C
program" is not a property that we are going to directly achieve by
tweaking our resolution and location algorithms. It is a very
important property, and we should think about how to facilitate users
to achieve it, but it arises only from the interaction of algorithm
and user -- not from algorithm alone.


Oh, and here's another one that is going to help me think more
clearly about these things in the future:

4. Correctly "applying patches in the way the user intended" is not
sufficient for "merge produces a better C program", because the
intents of the two users may conflict.


Okay, here's an open question. Is assertion 5 right?

uncertain assertion 5. Correctly answering both the resolution and
location questions is necessary and sufficient for "applying the
patch the way the user intended".


Regards,

Zooko

[1] https://zooko.com/badmerge/concrete-bad-semantics.html
Michael Richters
2008-02-26 18:25:46 UTC
Permalink
Post by zooko
Okay, here's an open question. Is assertion 5 right?
uncertain assertion 5. Correctly answering both the resolution and
location questions is necessary and sufficient for "applying the
patch the way the user intended".
I think this assertion is correct, but I will repeat my earlier point:
it is not possible for the computer to answer the location question
correctly with complete confidence. And this is debatable, but it may
not be possible for the user performing the merge to do so, either.

Back to the example:

branch 1 (user 1): A -> BA -> ABA
branch 2 (user 2): A -> X

If branch 1 is being merged into branch 2 by user 2, he doesn't know
user 1's intentions any more than the computer does. There are
several possible "correct" results from the merge:

XBA, ABX, XBX, and even ABA

Which one is "correct" depends on the intermediate, uncommitted
changes made by user 1, such as:

A -> BA -> ABA => ABX

A -> BA -> ( AB ) -> ABA => XBA

A -> BA -> ( ABA where both A's are meant to be the same ) -> ABA => XBX

A -> BA -> ( B -> ABA where A is coincidentally the same as the
original ) -> ABA => ABA

(The '=>' represents the merge with branch 2)

Which merge is "correct", based on the changes made by user 1, is
fundamentally unknowable, because he didn't commit all of those
intermediate changes (and if the system did all that bookkeeping it
would be both cumbersome to use, and frequently incorrect, because
users would probably not follow the correct procedure in every case).

You may consider my last two examples (XBX & ABA) to be invalid, and
you could certainly make that argument, but that is really a matter of
opinion. Maybe 'A' is a header of some sort, which should repeat
verbatim in the file, and contains a typo. Logically, if 'A -> X' is
a change that fixes that typo, it would make sense for the merge
result to be 'XBX'.

The last example is weaker, but it still could have been user 1's
intent to remove 'A' from the file, and later add the same contents.
Maybe this is just a case of user error; he should have committed the
deletion in order to communicate that intent to the version control
system.

I guess I'm making an argument for users committing changes frequently
in order to give the version control system the best chance of
understanding their intent, but since the system can't control the
users, it will always have to make the best guess it can, or prompt
the user doing the merge (and/or the commit) copiously for
information.

--Mike
Thomas Lord
2008-02-26 21:46:53 UTC
Permalink
Post by zooko
Some tools, such as arch, can track
"which file" precisely, but use a 3-merge for "which lines", which
means they cannot answer the "which lines" question without
guessing. Others, such as darcs, track "which lines" precisely.
I don't see why you think Arch "cannot" do what Darcs does.
If you really wanted Darcs-style merging for Arch, you
could probably implement it without modifying Arch at
all in a mix of sh(1) and awk(1) -- at least to get it started.
Arch's "database" conveniently contains all of the information
you need to run the Darcs merge algorithm.

Also: Arch *has* a 3-way merge built-in but it also has other
kinds of merge and, as I suggest above, you can add more if you
like without having to modify Arch itself at all.
Post by zooko
Obviously I, and everyone else, know that "merge produces a better C
program" is not a property that we are going to directly achieve by
tweaking our resolution and location algorithms. It is a very
important property, and we should think about how to facilitate users
to achieve it, but it arises only from the interaction of algorithm
and user -- not from algorithm alone.
That's all crazy talk, imo. The purpose of a merge algorithm
is to help make an integration process more efficient. Automatically
changing lots of lines in many files can speed up editing, but
the output shouldn't be a "better C program" but "a better state
of the source tree from an integrator's perspective".

For example, here is a merge feature I'd like to see:

Let's let patches be optionally *signed*.

Let every source tree contain a policy file, mapping
files and directories in the tree to various merging
policies and public keys. The policies might say
"only certain signers can modify certain files -- if
others try to, treat that as a merge conflict." Or,
the policies might say: "Permit any merge to modify
the file scheduler.c, but put conflict markers around
every change made, even if there is no conflict" (thus
forcing a by-hand review of changes to scheduler.c).
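A rough sketch of what I mean, in Python rather than any particular
config format (the path patterns, rule names, and key IDs are all
invented for illustration):

```python
from fnmatch import fnmatch

# Hypothetical per-tree merge policy: (path pattern, rule), checked in order.
POLICY = [
    ("scheduler.c", {"rule": "always-mark"}),     # force by-hand review
    ("Makefile*",   {"rule": "restricted",
                     "signers": {"KEYID-ALICE"}}),  # only listed signers
]

def disposition(path, signer):
    """Decide how a merge may modify `path`, given the patch's signer."""
    for pattern, rule in POLICY:
        if fnmatch(path, pattern):
            if rule["rule"] == "always-mark":
                return "merge, but wrap every change in conflict markers"
            if rule["rule"] == "restricted" and signer not in rule["signers"]:
                return "treat as a merge conflict"
    return "merge normally"
```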

Those examples illustrate that the quality of a merge tool,
for some users at least, may not have as much to do with
the semantics-in-C of the output of the merge.
Post by zooko
uncertain assertion 5. Correctly answering both the resolution and
location questions is necessary and sufficient for "applying the
patch the way the user intended".
That is false. Diffing, patching, and merging tools are
inevitably imprecise, heuristic instruments. They will always
have pathological cases that are common enough to arise
accidentally, in practice, at measurable-enough-to-worry-about
rates. No cleverness in merge algorithm design is going to change
any of that, ever.

This is a topic that often seems to get confused hackers talking
in circles because it gives rise to an accidental game. Programmer
A proposes merge algorithm X. Programmer B finds the
pathological cases in X. A proposes X'. B finds the X' problems.
Lather, rinse, repeat: this is apparently where much of the discourse
around open source revision control projects has stood pat since,
perhaps, 2002 or so. The punch line is that it's a shaggy dog story.
There is no X'''..'''' which fixes all the problems. A different *kind*
of solution is needed.

That other kind of solution is the "shop tools" approach. No
merge algorithm will always DTRT and the pathologies of every
known tool are significant. So, the solution is to make it easy to
not overlook the pathologies when they occur, and to have a variety
of specialized tools so that, when pathologies occur using one tool,
perhaps there is another tool to do the job instead. A variety of
tools -- hard to miss when pathologies happen -- easy to review what
a tool has just done. A very different problem from "produce a
better C program".

And this gets back to your "uncertain (5)": user intention.
If integrators are using shop tools, then when I thoughtfully
submit a "patch" I must be offering some raw materials (fragments
of new code) and a "construction plan" that tells how to use
those raw materials, plus tools, to make a modification to
the integrator's very own version of the program.

"Construction plans" as exchanged between more traditional
shops (such as wood shops) are human-to-human communication
of imprecise formulae. Perhaps my submitted plan describes
the construction of a 36" cabinet but you need a 24" cabinet.
Perhaps the plan calls for a bar clamp but you have something
else in mind. A good construction plan is then one that is lucid
enough for you to work with as you adapt it to your situation,
using the tools and materials you have at hand. When I submit
my patch, *that* is my intention: to give you a "workable" plan,
not a perfect one.

By definition, a "construction plan" can not be automatically
applied. Human judgment is called for to interpret and adapt
the plan. A merge tool "conforms to user intention" if both
the submitter and integrator agree about how, generally, it
will behave when manipulated with various standard tools
kept within nominal operating conditions.

In other words, the best way to be faithful to user intention is
first to build tools that behave unsurprisingly and, second, tools
that help software craftspeople efficiently exchange and manipulate
construction plans, each plan giving a description of how one revision
of a program can be changed into another.


-t
zooko
2008-02-26 21:03:17 UTC
Permalink
Post by Thomas Lord
I don't why you think Arch "cannot" do what Darcs does.
So to speak more precisely, arch by default uses a 3-merge, which
cannot track the location of lines precisely.
Post by Thomas Lord
Let's let patches be optionally *signed*.
Let every source tree contain a policy file,
[...]

Those are very interesting ideas. I think monotone may be working on
things like that.
Post by Thomas Lord
Post by zooko
uncertain assertion 5. Correctly answering both the resolution and
location questions is necessary and sufficient for "applying the
patch the way the user intended".
That is false. Diffing, patching, and merging tools are
inevitably imprecise, heuristic instruments.
Could you be more precise? :-)

My uncertain (5) is that the combination of "resolution does what the
user intended" and "location is precise" is necessary and sufficient
for "applying the patch the way the user intended". Oh, let me
clarify that I mean here applying the patch in the absence of a
conflicting intention by another user.

Could you be more precise in saying why those two qualities are
insufficient for that third quality? Is it that you were thinking of
the third quality as something more powerful than I was -- something
like "Do The Right Thing in the presence of conflicting intentions"?
Or is it that you think the resolution can never match the user's
intention, therefore the question of whether correct resolution plus
correct location equals correct patch application is irrelevant? Or
something else?
Post by Thomas Lord
A variety of
tools -- hard to miss when pathologies happen -- easy to review what
a tool has just done. A very different problem from "produce a
better C program".
I think this is an important insight -- there is a quality which is
different from "produce a better C program", and which is closer to
the realm of things that an algorithm can directly offer. This
quality is "user can better predict (or post facto understand) what
will happen when a patch was applied". This is the quality that I
was thinking of when I wrote "apply the patch the way the user
intended". This is similar to your "easy to review what a tool has
just done".

I agree that this is a very different problem from "produce a better
C program".
Post by Thomas Lord
In other words, the best way to be faithful to user intention is
first to build tools that behave unsurprisingly and, second, tools
that help software craftspeople efficiently exchange and manipulate
construction plans, each plan giving a description of how one revision
of a program can be changed into another.
Ah, sounds like an excellent strategy.

Regards,

Zooko
William Uther
2008-02-26 22:56:42 UTC
Permalink
Post by zooko
My uncertain (5) is that the combination of "resolution does what the
user intended" and "location is precise" is necessary and sufficient
for "applying the patch the way the user intended". Oh, let me
clarify that I mean here applying the patch in the absence of a
conflicting intention by another user.
I think your "resolution does what the user intended" includes the
third "patch semantics" dimension. I think you'd be better off
separating that out.

Maybe something along the lines of: If
i) the patch language can accurately represent user intent, and
ii) resolution can correctly identify that intent and render it in
the patch language, and
iii) location is precise

then this is sufficient for "applying the patch the way the user
intended".

As a side note, for a patch to be applied "correctly" I suspect that
all the patches in the system need to follow those constraints, not
just the patch being applied.

Cheers,

Will :-}
Thomas Lord
2008-02-27 04:49:11 UTC
Permalink
Post by zooko
Post by Thomas Lord
Let's let patches be optionally *signed*.
Let every source tree contain a policy file,
[...]
Those are very interesting ideas. I think monotone may be working on
things like that.
That is a very important area of work, imo, and it
deserves wider participation in the design process
than any single project can afford. If the monotone
project is working on it, good, but that isn't obviously
enough.

One feature that could benefit the free software community
is merge features that assist in "proportionate attribution" and
that work *across* the various revision control systems.
For example, when you look at the revision control history of,
say, RHEL -- you should be able to see that the system integrators
attributed 50% of the credit for changes in the latest release to,
say, the kernel group, 25% to the gnome project, etc. Digging
in, you should be able to recursively find that the gnome maintainer
gave 3% credit to that developer, 15% credit to that other "trusted
lieutenant", etc. Those can, in turn, be used to pay royalties.

Being able to track authentic histories of patch application
*across* particular revision control tools is very important to the
community, if we are to obtain a truly secure economic foundation.

Merge tools should, ideally, be agnostic as to the choice of
revision control system. So should tools for auditing the
history of a revision (such as to discover the recursive,
proportionate credit attributions).
Post by zooko
Post by Thomas Lord
Post by zooko
uncertain assertion 5. Correctly answering both the resolution and
location questions is necessary and sufficient for "applying the
patch the way the user intended".
That is false. Diffing, patching, and merging tools are
inevitably imprecise, heuristic instruments.
Could you be more precise? :-)
Yes, but not concisely. That's the problem. People get bored
listening to a perfectly good but rather long explanation of why
there is no perfect merge algorithm and so they stop listening.
Since they don't learn there can't be a perfect merge algorithm,
they then paradoxically *waste* time BSing about their design
for one. It's a bit like the "perpetual motion machine" industry.
Post by zooko
My uncertain (5) is that the combination of "resolution does what the
user intended" and "location is precise" is necessary and sufficient
for "applying the patch the way the user intended". Oh, let me
clarify that I mean here applying the patch in the absence of a
conflicting intention by another user.
Could you be more precise in saying why those two qualities are
insufficient for that third quality? Is it that you were thinking of
the third quality as something more powerful than I was -- something
like "Do The Right Thing in the presence of conflicting intentions"?
Or is it that you think the resolution can never match the user's
intention, therefore the question of whether correct resolution plus
correct location equals correct patch application is irrelevant? Or
something else?
One of the "intentions" behind a patch can usefully be
that the patch is intended to be applied to trees that are
related, but for which a complete history is not available.
Closely related is the intention that the patch be useful even if the
history is available -- but "quirky" and not helpful to
merging. The patch is *documentation* of an actual change
between two actual revisions (usually). The patch is *intended*
as a description -- a "construction plan".

So, no, "resolution does what...." and "location is precise" are not
necessary or sufficient for much of anything, in the sense you are
using those (rather prejudicially chosen) names.

-t
Post by zooko
Post by Thomas Lord
A variety of
tools -- hard to miss when pathologies happen -- easy to review what
a tool has just done. A very different problem from "produce a
better C program".
I think this is an important insight -- there is a quality which is
different from "produce a better C program", and which is closer to
the realm of things that an algorithm can directly offer. This
quality is "user can better predict (or post facto understand) what
will happen when a patch was applied". This is the quality that I
was thinking of when I wrote "apply the patch the way the user
intended". This is similar to your "easy to review what a tool has
just done".
I agree that this is a very different problem from "produce a better
C program".
Post by Thomas Lord
In other words, the best way to be faithful to user intention is
first to build tools that behave unsurprisingly and, second, tools
that help software craftspeople efficiently exchange and manipulate
construction plans, each plan giving a description of how one revision
of a program can be changed into another.
Ah, sounds like an excellent strategy.
Regards,
Zooko
zooko
2008-02-27 13:20:29 UTC
Permalink
Post by Thomas Lord
One feature that could benefit the free software community
is merge features that assist in "proportionate attribution" and
that work *across* the various revision control systems.
Hm.

http://ohloh.net seems to be generating data like this. I haven't
looked into it deeply. They are open-sourcing at least some of their
automation.

I take it that you are (still) interested in the question of how to
close the feedback loop from the value of Free Software to the
production of it? I am deeply interested in that issue, too.
Post by Thomas Lord
One of the "intentions" behind a patch can usefully be
that the patch is intended to be applied to trees that are
related, but for which a complete history is not available.
This is a good point. I would fit it into my framework by saying
that precise location requires complete history, and may require
complex computation over that complete history.

In fact, computation of precise location may be impractical. Darcs
is admirable for showing that it is possible to solve location
precisely, but darcs has not yet shown that it is possible to do so
with an algorithm which is guaranteed to be computable in reasonable
time in the worst case.

SCCS/BitKeeper/Codeville weaves do precise location at least some of
the time, and with excellent performance even in the worst case, but
I personally don't understand them well enough to understand if they
do precise location in all cases.
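For concreteness, here is a toy model of the weave idea (my own
sketch, not the actual SCCS/BitKeeper file format): the whole merged
history is one interleaved sequence, each line tagged with the
version that inserted it and, if any, the version that deleted it.
Extracting any revision is a single linear pass, which is why the
worst-case performance is so good.

```python
# Each weave entry: (text, inserted_by, deleted_by-or-None)
weave = [
    ("A", 1, 2),     # "A" appeared in v1 and was deleted in v2
    ("X", 2, None),  # v2 replaced it with "X"
    ("B", 3, None),  # v3 (a branch off v1) appended "B"
]

def extract(weave, ancestry):
    """Render the file for a revision containing the changes in `ancestry`."""
    return [text for text, ins, dele in weave
            if ins in ancestry and dele not in ancestry]

print(extract(weave, {1}))        # ['A']
print(extract(weave, {1, 2}))     # ['X']
print(extract(weave, {1, 3}))     # ['A', 'B']
# A merge of v2 and v3 locates "B" with no guessing:
print(extract(weave, {1, 2, 3}))  # ['X', 'B']
```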

So, getting back to your objection, one could reasonably say, using
my framework, "Yes, user-intended resolution plus precise location
implies user-intended patch application (in the absence of
conflicts), but precise location requires complete history, which
cannot be guaranteed.".
Post by Thomas Lord
Closely related is the intention that the patch be useful even if the
history is available -- but "quirky" and not helpful to
merging. The patch is *documentation* of an actual change
between two actual revisions (usually). The patch is *intended*
as a description -- a "construction plan".
This I don't really understand. Are you saying that someone may give
you a patch with the intent that you *not* apply with an automated
tool, but instead with a combination of human and tool?

Regards,

Zooko
Thomas Lord
2008-02-27 18:18:41 UTC
Permalink
Post by zooko
I take it that you are (still) interested in the question of how to
close the feedback loop from the value of Free Software to the
production of it? I am deeply interested in that issue, too.
Good to know.
Post by zooko
So, getting back to your objection, one could reasonably say, using
my framework, "Yes, user-intended resolution plus precise location
implies user-intended patch application (in the absence of
conflicts), but precise location requires complete history, which
cannot be guaranteed.".
And I say "No" to that, because the "user intention" behind a
patch application is expressed in terms of the human goals of a
project while, on the other hand, technical matters like "resolution"
and "location" do not form a mathematical model rich enough to
describe the space of "human goals".

Consider a patch which corrects all misspelled occurrences of the
word "miscellaneous" in the source files of a program. The commit
comment might say "Fix a ubiquitous spelling mistake" and the
comment might even include a little shell script that automatically
makes the change.
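As a concrete sketch of that scenario (the misspelling, file layout, and function name are all invented for illustration, and Python stands in for the little shell script the commit comment might actually contain):

```python
import pathlib

def fix_spelling(root="."):
    """Apply the committer's 'fix it everywhere' intent: replace an
    (invented) misspelling in every C source file under root."""
    for path in pathlib.Path(root).glob("*.c"):
        text = path.read_text()
        path.write_text(text.replace("miscelaneous", "miscellaneous"))
```

Applying the recorded hunks reproduces the fix only against the exact base revision; re-running the script reproduces the *intent* against any revision.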

"User intent" is best captured in that case by applying the patch
in the ordinary way if you are starting from the exact same base
revision, or by running the shell script otherwise. This
illustrates how "user intent" -- which is about human goals -- can't
be modeled by "resolution" and "location".

A more abstract example concerns *line duplication*. Line motion
is one thing but what about duplication? When is it the case that
a user's intent is to apply a given patch hunk to *all copies of*
certain lines? Again: "user intent" is not on the map of resolutions
and locations.
Post by zooko
Post by Thomas Lord
Closely related is the intention that the patch be useful even if the
history is available -- but "quirky" and not helpful to
merging. The patch is *documentation* of an actual change
between two actual revisions (usually). The patch is *intended*
as a description -- a "construction plan".
This I don't really understand. Are you saying that someone may give
you a patch with the intent that you *not* apply with an automated
tool, but instead with a combination of human and tool?
Perhaps. For example, an integrator of a certain skill level might
approach a merge by first trying a 3-way and examining the results.
If unproblematic, fine, but the skill would be to recognize when a
merge is failing in a way that is likely to be reduced by using
patch commutation ("Darcs-style"), and then to know how to use
a tool that tries that other approach. Another skill would be to read
commit comments for things like "the shell script that produces this
change" and make special exceptions for something like merging a
spelling correction.

In general, a committed patch is a construction plan. An integrator
reads a description of what the patch is alleged to do. He can study
the patch itself as well as the revision it originally produced. He
can infer, from those things, the semantics of the changes intended
to be made to the target program. The integrator can then envision
how analogous changes would be made to the program he wants
to patch. *Then* and only then do merge tools enter the picture
and the question is: "Which of these merge-oriented editing tools
will be the most helpful in quickly making the change I have
envisioned to this program?"

-t

William Uther
2008-02-26 22:45:23 UTC
Permalink
Post by zooko
Right -- I generally agree with what Michael Richters, Walter
Franzini, Tom Lord, Marnix Klooster, and William Uther have posted in
this thread.
I like to break down the parts of the revision control algorithm like
[snip resolution and location definitions]

In general I agree with those definitions. I think it is important to
include a third concept here as well: patch semantics.

e.g. If I move a function in a file, most current resolution algorithms
will interpret that as a deletion of some old lines, and the addition
of some new lines. A system that could explicitly represent line
moving would capture that patch intent better.

e.g. 2: DARCS's search/replace patches (never used them, but it's the
thought that counts).

Both of these examples are independent of resolution and location.
(I guess another way to look at this is that resolution is impossible
if you cannot represent the *entire* patch semantics, but that seems
a little strong to me.)

If you could get resolution, location and patch semantics correct, then
you'd be doing well. Patch semantics is *really hard* though.

The next question is what happens when the patch semantics are wrong.
It would be nice if you could guarantee for a certain mismatch the
result was simply extra conflicts (and I could imagine that guarantee
holding for "move, add, delete" vs "add, delete" patch semantics).

Be well,

Will :-}
Peter Miller
2008-02-27 23:01:04 UTC
Permalink
Post by zooko
There is "resolution" (terminology chosen by the Codeville guys),
which is the process of looking at two snapshots of the filesystem
and inferring what sort of change the user meant. This is the part
where the hunks can be "not what the user intended" while still being
}
void g(void)
{
printf("g!\n");
This is a good example of the problem with patches as first class
entities. For a "minimum edit distance" diff algorithm (or even the
"longest subsequence" diff algorithm) there are in fact three equally
valid patches:

patch 1:

 blah blah
+}
+
+void g(void)
+{
 }
 
 yada yada

patch 2:

 blah blah
 }
+
+void g(void)
+{
+}
 
 yada yada

patch 3:

 blah blah
 }
 
+void g(void)
+{
+}
+
 yada yada

The diff chooses to output one of them, but it must be stressed that
*all* are equally correct, because all have the *same* edit distance.
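A small Python sketch of the point, with the file contents assumed from the example above: two of the equal-metric placements add the same four lines and rebuild the identical file, so nothing in the data itself can prefer one over the other.

```python
# The old and new file contents from the example above.
old = ["blah blah", "}", "", "yada yada"]
new = ["blah blah", "}", "", "void g(void)", "{", "}", "", "yada yada"]

# Placement A: treat the existing "}" and blank line as context and
# insert the new function after them.
a = old[:3] + ["void g(void)", "{", "}", ""] + old[3:]

# Placement B: treat the existing "}" as g's closing brace and insert
# a replacement "}" (plus the rest of the hunk) before it.
b = old[:1] + ["}", "", "void g(void)", "{"] + old[1:]

# Both add exactly four lines and yield the identical result.
assert a == new and b == new
```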

I really would like it if folks would stop saying "guess". This isn't a
guess, it is simply an arbitrary choice of one answer amongst N
equal-metric answers. This is a data-losing transform, but it isn't a
guess.

Merging two diffs becomes interesting, because the first diff is "just
one of N" and the second diff is "just one of M" but the merge algorithm
is NOT seeing all of the information. Thus the two diffs get combined
into a single output which may, in fact, be suboptimal, compared to a
merge which had all N*M possibilities available... and like the input
diffs, the merge must inevitably present only one of many equal-metric
answers.

This is why tree states, IMO, are better than patches: they contain more
information.

There are times I think we get too hung up on patches as an
implementation, and lose sight of the fact that the implementation is
not the design, that the diff/patch implementation does not represent
the entire solution space. Sometimes it helps to back up and look at
the problem unblinkered by decades of using one particular solution.

Yes, patches are handy for email.
Yes, patches make for simple code reviews.
Yes, patches can be signed.
No, patches aren't the only answer.


Regards
Peter Miller <millerp at canb.auug.org.au>
/\/\* http://miller.emu.id.au/pmiller/

PGP public key ID: 1024D/D0EDB64D
fingerprint = AD0A C5DF C426 4F03 5D53 2BDB 18D8 A4E2 D0ED B64D
See http://www.keyserver.net or any PGP keyserver for public key.

"We're still almost done again." -- Final Cut Pro easter egg
Thomas Lord
2008-02-28 01:21:48 UTC
Permalink
Post by Peter Miller
I really would like it if folks would stop saying "guess". This isn't a
guess, it is simply an arbitrary choice of one answer amongst N
equal-metric answers. This is a data-losing transform, but it isn't a
guess.
It's a "guess" in the sense that the tools are assuming
that any member of the set of minimal-edit-distance
solutions is a member of the set of optimal solutions.

There are multiple, equally good guesses (sometimes) in
that view -- but all of them are still guesses.

"lines" have almost no relation at all to the semantics of programs
or much else besides. It's a "guess" at the core.

-t
William Uther
2008-02-27 23:49:22 UTC
Permalink
Post by Peter Miller
Post by zooko
There is "resolution" (terminology chosen by the Codeville guys),
which is the process of looking at two snapshots of the filesystem
and inferring what sort of change the user meant. This is the part
where the hunks can be "not what the user intended" while still being
}
void g(void)
{
printf("g!\n");
This is a good example of the problem with patches as first class
entities. For a "minimum edit distance" diff algorithm (or even the
"longest subsequence" diff algorithm) there are in fact three equally
valid patches:

patch 1:

 blah blah
+}
+
+void g(void)
+{
 }
 
 yada yada

patch 2:

 blah blah
 }
+
+void g(void)
+{
+}
 
 yada yada

patch 3:

 blah blah
 }
 
+void g(void)
+{
+}
+
 yada yada
The diff chooses to output one of them, but it must be stressed that
*all* are equally correct, because all have the *same* edit distance.
All have the same edit distance, so *as far as the system is concerned*
they are all equally correct. However, the user has more information
here, including an understanding of the semantics of what they're
writing. That extra information can be useful, and it might be nice
to store it, even if you choose not to use it later.
Post by Peter Miller
I really would like it if folks would stop saying "guess". This isn't a
guess, it is simply an arbitrary choice of one answer amongst N
equal-metric answers.
Um, "arbitrary choice of one answer amongst N equal-metric answers",
where some capture what the user intended and some don't, seems to
me to be as close as matters to "guess". But if you want to use the
term "data-losing transform", ok.

I would add that if you put the user in the loop here, the transform
doesn't lose as much data.
Post by Peter Miller
Merging two diffs becomes interesting, because the first diff is "just
one of N" and the second diff is "just one of M" but the merge
algorithm
is NOT seeing all of the information. Thus the two diffs get combined
into a single output which may, in fact, be suboptimal, compared to a
merge which had all N*M possibilities available... and like the input
diffs, the merge must inevitably present only one of many equal-metric
answers.
This is why tree states, IMO, are better than patches: they contain more
information.
Hrm. That's interesting because I think about it in exactly the
opposite way.

From a complete sequence of patches, you can recover the trees if you
want.
You can then consider all M*N possibilities if you choose. i.e. The
trees do NOT contain more information than a complete sequence of
patches, although they may contain more information than individual
patches in the sequence.
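That reconstruction is mechanical. A sketch, with a patch modelled (purely for illustration) as a position plus the lines it deletes and inserts:

```python
def apply_patch(tree, patch):
    """Apply one patch, modelled as (pos, deleted_lines, inserted_lines)."""
    pos, dels, ins = patch
    assert tree[pos:pos + len(dels)] == dels  # context must match
    return tree[:pos] + ins + tree[pos + len(dels):]

def replay(base, patches):
    """Recover every intermediate tree from a base tree and a patch sequence."""
    trees = [base]
    for p in patches:
        trees.append(apply_patch(trees[-1], p))
    return trees
```

So the trees carry no information the patch sequence lacks; the converse is not true, since diffing two trees discards which of the equal-metric patches the user meant.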

In fact, iff the user is in the loop then the patches contain more
information than the trees. In particular, the user has identified the
best patch of the N possible patches, and the best patch of the M
possible patches (where by 'best' I mean "best fits their semantic
knowledge of the change they were trying to make"). This means that
any system using the patches can make use of that extra information
if they so choose. Or they could simply re-construct the trees and
ignore the extra information.
Post by Peter Miller
There are times I think we get too hung up on patches as an
implementation, and lose sight of the fact that the implementation is
not the design, that the diff/patch implementation does not represent
the entire solution space. Sometimes it helps to back up and look at
the problem unblinkered by decades of using one particular solution.
AFAIK there is only one modern system that uses patches to do the
merging: Darcs. And, as has been noted, it doesn't really give the
user enough control. As for the other systems: git doesn't, svn
doesn't, monotone doesn't, I don't think mercurial does, codeville
doesn't and bazaar used to not use patches but I've heard someone
say they're not using weaves any more, so I'm not really sure.

Oh, so6 does too.
Post by Peter Miller
Yes, patches are handy for email.
Yes, patches make for simple code reviews.
Yes, patches can be signed.
No, patches aren't the only answer.
And most systems don't use them :). Many of the above systems
do use delta-compressed storage, but to merge they simply
reconstruct the trees and go from there.

Be well,

Will :-}
zooko
2008-02-28 00:07:15 UTC
Permalink
Post by William Uther
AFAIK there is only one modern system that uses patches to do the
merging: Darcs. And, as has been noted, it doesn't really give the
user enough control. As for the other systems: git doesn't, svn
doesn't, monotone doesn't, I don't think mercurial does, codeville
doesn't and bazaar used to not use patches but I've heard someone
say they're not using weaves any more, so I'm not really sure.
Oh, so6 does too.
What does bazaar do now?

What's so6? Oh: [1]. News to me.

You know, I looked away from the Free Software decentralized revision
control tools for a moment (okay, maybe for a year), and a bunch of
things have changed. I just learned today that Vesta [2] uses
Precise Codeville Merge [3] now, for example.

ESR launched on an ambitious task of gathering and surveying the
knowledge [4], but that hasn't twitched in a month.

Regards,

Zooko

[1] http://dev.libresource.org/home/doc/so6-user-manual
[2] http://www.vestasys.org/
[3] http://revctrl.org/PreciseCodevilleMerge
[4] http://thyrsus.com/lists/uvc-reviewers/
William Uther
2008-02-28 00:27:54 UTC
Permalink
Post by zooko
Post by William Uther
AFAIK there is only one modern system that uses patches to do the
merging: Darcs. And, as has been noted, it doesn't really give the
user enough control. As for the other systems: git doesn't, svn
doesn't, monotone doesn't, I don't think mercurial does, codeville
doesn't and bazaar used to not use patches but I've heard someone
say they're not using weaves any more, so I'm not really sure.
Oh, so6 does too.
What does bazaar do now?
No idea. I just made the claim that Bazaar used a weave on ESR's
mailing list and was told I was out of date.
Post by zooko
What's so6? Oh: [1]. News to me.
Yeah - it's an interesting system. Much like Darcs, but based on
OT theory rather than Darcs' own theory of patches. The two seem
very similar. so6 doesn't have an exponential case in the
merging, but it only allows distributed use if the distribution
is tree structured. I think that's not much of a limitation
most of the time (people sync with a single up-stream source),
but then there will be times when it is a show-stopper (when
you need to change up-stream sources).
Post by zooko
You know, I looked away from the Free Software decentralized revision
control tools for a moment (okay, maybe for a year), and a bunch of
things have changed.
heh. Why can't they all just stay neatly in their boxes :p

Be well all,

Will :-}
Thomas Lord
2008-02-28 02:16:24 UTC
Permalink
Post by zooko
ESR launched on an ambitious task of gathering and surveying the
knowledge [4], but that hasn't twitched in a month.
Not to get overly personal, but I think (or is it "hope") that
some of his recent reticence is from a recognition that he
might have bitten off a bit more than anyone could chew with his
initial ambitions in that project.

Perhaps the bug in his project is that it is too backwards looking.
He started off trying to discover history and to assess state and to
make current recommendations. Perhaps it is too late to catch those
waves. For example, the Emacs project seems to be finding solutions
just fine, on its own, as some predicted.

He should join us in talking not about what has happened in the
past, but about what is achievable in the foreseeable future,
if we are to put effort into it. That is,
to "re-ground" his primary capacity as advocate, if he is interested
in more than just gaming programs, he needs another *taste* (not
overwhelming expertise -- just a bits-level taste) of what is achievable.
For which purpose, instead of soliciting help with history and taxonomy,
he might instead host discussion of what is possible and desirable.


-t
Thomas Lord
2008-02-28 02:06:13 UTC
Permalink
Post by William Uther
Post by Peter Miller
Yes, patches are handy for email.
Yes, patches make for simple code reviews.
Yes, patches can be signed.
No, patches aren't the only answer.
And most systems don't use them :). Many of the above systems
do use delta-compressed storage, but to merge they simply
reconstruct the trees and go from there.
I get sick of saying this but, Arch has it all right in
design, albeit not in realization of design:

You first make up a taxonomy of "trees" -- what exactly
is a directory or a file? Does it have, for example,
some transcendent ID that survives even across renames?

You provide, as the core storage thing, a global, decentralized,
human-friendly, trees-and-sequences structured mostly-write-once
namespace for binding names to trees. That is, there's a global
namespace of trees and, in this context, we have a write-once,
hierarchical, version-numbered namespace of trees.

Given that core, third parties can define relations. For example,
someone could assert "Tree A is a delta (using the patch(1) algorithm)
that relates tree B to tree C". A revision control transaction, such
as a commit, is essentially the assertion of such a relation.

So, the global database is a loosely hierarchical/versioned space
of names, bound to arbitrary trees. The core functions are to
bind a name to a tree or to retrieve the tree associated with
a name.
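A minimal sketch of that core, assuming nothing beyond the text above (the class and method names are my invention):

```python
class TreeNamespace:
    """Global, write-once namespace binding hierarchical names to trees."""

    def __init__(self):
        self._bindings = {}

    def bind(self, name, tree):
        """Bind a name to a tree; names are write-once, so rebinding fails."""
        if name in self._bindings:
            raise ValueError(f"{name!r} is already bound")
        self._bindings[name] = tree

    def lookup(self, name):
        """Retrieve the tree associated with a name."""
        return self._bindings[name]
```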

Every other aspect of revision control is either a layer above
or an optimization of *that*. As a practical matter, *hash-caching*
is probably all the optimization anyone actually needs.
That is: express all of your revision control high-level operations
(complex transactions) as brute-force algorithms that read from
and write to the global namespace of trees -- but that partition their
work such that each expensive step breaks down into sub-steps that
are likely to be needed more than once. Then create a cache of
the result of each sub-step.

The "skip-delta" trick used in the Subversion database and the
"revision library" trick used in GNU Arch are both examples of
limited, overly specialized, hand-coded hash-caching. They could
both be subsumed under a more general system that knows how to
hash-cache trees in general -- not just a tree representation of a delta
("skip deltas") or a tree representation of a complete revision ("revision
libraries"). Given two tree names and a well-known function that
produces a third tree: look the result of the function up in the cache or
compute it by brute force and store it in the cache. The brute-force
computation hopefully amounts to breaking it down into simpler
cache look-ups.
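The general scheme might be sketched like so (all names here are invented, and a "tree" is modelled as a tuple of lines):

```python
import hashlib

def tree_id(tree):
    """Name an immutable tree by the hash of its content."""
    return hashlib.sha1("\n".join(tree).encode()).hexdigest()

CACHE = {}

def cached_relation(fn_name, compute, *trees):
    """Look up fn_name applied to these trees by their hashes; on a
    miss, compute the result by brute force and remember it."""
    key = (fn_name,) + tuple(tree_id(t) for t in trees)
    if key not in CACHE:
        CACHE[key] = compute(*trees)
    return CACHE[key]
```

Both a skip-delta and a revision library then fall out as particular well-known functions whose results happen to be cached.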

The fundamental intention of a patch submitter in distributed,
decentralized revision control could be described as: "Let the world
know that I said X". The core of a revision control system should
do nothing more than record that fact -- and make it efficient to
cache computations of simple, decomposable relations among
the trees thus stored.

Given such a minimalist core, our discourse about things like
how to use "history" in "merging" would change flavor. We could
begin to express our competing views as very high-level algorithms -- useful
for discourse but that can also just be run as programs -- written against
the core hash-cached relations of trees functionality.

Some f'ed up conjunction of economic circumstances has caused
us all to have the concept of "revision control system" as a product
category and therefore to define and attempt to solve problems with
respect to the boundaries around that product category. Yet, underlying
what is needed, is a much more general thing: user-contributed datums --
named trees for example -- and functional, cacheable relations between
them -- all organized as a distributed, decentralized system.

"Zooko's triangle," indeed.

-t



Peter Miller
2008-02-28 01:06:20 UTC
Permalink
Post by William Uther
Um, "arbitrary choice of one answer amongst N equal-metric answers",
where some capture what the user intended and some don't, seems to
me to be as close as matters to "guess". But if you want to use the
term "data-losing transform", ok.
When I say that the value of the square root of 2 is
1.4142135623730950488..., I am not accused of "guessing", even though
this is only 1 of 2 correct answers. Higher roots are even guessing-er.


Regards
Peter Miller <millerp at canb.auug.org.au>
/\/\* http://miller.emu.id.au/pmiller/


2000 pounds of Chinese soup = Won ton
William Uther
2008-02-28 01:59:17 UTC
Permalink
Post by Peter Miller
Post by William Uther
Um, "arbitrary choice of one answer amongst N equal-metric answers",
where some capture what the user intended and some don't, seems to
me to be as close as matters to "guess". But if you want to use the
term "data-losing transform", ok.
When I say that the value of the square root of 2 is
1.4142135623730950488..., I am not accused of "guessing", even though
this is only 1 of 2 correct answers. Higher roots are even guessing-
er.
When people ask for the "square root" of a number, they're implicitly
asking for the positive root. This is just a fact of the English
language. Note that when writing both roots one normally uses +/- out
the front to capture the fact that you're referring to both.

If I asked you for "one of the prime factors of 12, that I'm
thinking of", and you say "2", then you're guessing.

But enough arguing by analogy over a minor point of terminology.

Be well,

Will :-}
Thomas Lord
2008-02-28 03:56:09 UTC
Permalink
Post by Peter Miller
Post by William Uther
Um, "arbitrary choice of one answer amongst N equal-metric answers",
where some capture what the user intended and some don't, seems to
me to be as close as matters to "guess". But if you want to use the
term "data-losing transform", ok.
When I say that the value of the square root of 2 is
1.4142135623730950488..., I am not accused of "guessing", even though
this is only 1 of 2 correct answers. Higher roots are even guessing-er.
The square root of 2 is objectively defined.

The "best patch inferred by comparing two versions" is subjective.

So, it's not a good analogy.

Or, if there is analogy, it is that your approximation is as good as
another one of the square root of 2, but it is just a guess that
approximations of the square root of 2 are the desired number.

-t
Marnix Klooster
2008-02-26 17:04:23 UTC
Permalink
Post by William Uther
Post by Thomas Lord
That's not strictly true. Darcs is making use of
heuristically estimated code motion. So Darcs merge
solution is, actually, still a guess -- just a different kind
of guess.
Hi,
Tom didn't say DARCS was guessing when merging. He said that the
DARCS merge *solution* is still based on guessing. This is true for
exactly the reason you identified - the initial patch is a guess.
The merge algorithm suffers from "Garbage In, Garbage Out", or
rather "Guess In, Guess Out".
OK, fair enough.
Post by William Uther
If DARCS had a way for the user to check that the patch was 'correct'
before it was committed, then there would be no guessing in the
solution.
In that case the user could catch the example you gave of a bad diff.
Actually, darcs *does* let the user check the patch that it creates:

$ darcs record
hunk ./src/fgh.c 14
+
+void g(void)
+{
+ printf("g!\n");
+}
Shall I record this change? (1/?) [ynWsfqadjkc], or ? for help: y
What is the patch name? Add g.
Finished recording patch 'Add g.'
$

(BTW, we see that in my actual example, darcs creates a diff that the
user probably finds OK.) So the UI shows the detail of the patch as it will
be created. Yes, the UI could be more sophisticated, by showing more
context of the diff, or even a 'visual' diff, instead of just the line
number. But all information is there, and the user can check the
correctness of the patch right when it is created.

That said, while the user can *catch* a bad diff, current darcs does not
have a way to *influence* the diff that darcs chooses. But that is
basically a UI issue, that could be added without any changes to the
precise merging semantics. (For example, the user could type 'e' for
'edit', and be presented with a version of src/fgh.c with special
markers around the chosen 'hunks', which the user then can move/copy to
indicate the parts that need to be separate 'hunks'.)

Groetjes,
<><
Marnix
William Uther
2008-02-26 22:32:22 UTC
Permalink
Post by Marnix Klooster
Post by William Uther
If DARCS had a way for the user to check that the patch was
'correct'
before it was committed, then there would be no guessing in the
solution.
In that case the user could catch the example you gave of a bad diff.
[snip]
Post by Marnix Klooster
(BTW, we see that in my actual example, darcs creates a diff that user
probably finds OK.) So the UI shows the detail of the patch as it will
be created. Yes, the UI could be more sophisticated, by showing more
context of the diff, or even a 'visual' diff, instead of just the line
number. But all information is there, and the user can check the
correctness of the patch right when it is created.
That said, while the user can *catch* a bad diff, current darcs does not
have a way to *influence* the diff that darcs chooses. But that is
basically a UI issue, that could be added without any changes to the
precise merging semantics.
Agreed. I thought of producing such a UI at one point, before life
intervened. Sigh.

I was thinking of something like a normal merge algorithm/UI, but
where the user could specify specific constraints. The diff algorithm
where you first identify unique lines is similar, but here you'd first
have "user constrained equal" lines, then unique lines given those
constraints...

The UI would be something along the lines of a normal merge UI - a
split window with a file on each side. You can drag a line from one
side on top of a line on the other side to specify that the two lines
are constrained to be equal.
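Under the hood, such pinned lines could be imposed by diffing the segments between consecutive pins independently; a sketch using Python's difflib (the function name and pin representation are my invention):

```python
import difflib

def constrained_diff(a, b, pins):
    """Diff line lists a and b, forcing each pinned pair (i, j),
    with a[i] == b[j], to be treated as equal; pins must be sorted."""
    out, pa, pb = [], 0, 0
    for i, j in pins + [(len(a), len(b))]:
        out.extend(difflib.ndiff(a[pa:i], b[pb:j]))  # diff segment before pin
        if i < len(a):
            out.append("  " + a[i])                  # pinned line as context
            pa, pb = i + 1, j + 1
    return out
```

Dragging one line onto another in the UI would simply add a pair to `pins`.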

Be well,

Will :-}
Walter Franzini
2008-02-26 21:07:51 UTC
Permalink
Robert Collins
2008-02-26 21:14:18 UTC
Permalink
Post by zooko
My brother asked me for advice on choosing a revision control tool,
so that prompted me to update this old quick ref for the first time
https://zooko.com/revision_control_quick_ref.html
Interesting that you record aegis as 'decentralised: no' - I suspect
Peter Miller, and the numerous contributors who have sent him patchsets
would disagree :).

-Rob
--
GPG key available at: <http://www.robertcollins.net/keys.txt>.
zooko
2008-02-27 15:55:49 UTC
Permalink
Post by Robert Collins
Post by zooko
My brother asked me for advice on choosing a revision control tool,
so that prompted me to update this old quick ref for the first time
https://zooko.com/revision_control_quick_ref.html
Interesting that you record aegis as 'decentralised: no' - I suspect
Peter Miller, and the numerous contributors who have sent him
patchsets
would disagree :).
To maintain a page like that one of course requires a fine balance
between being long-winded and being wrong.

Currently my attempt to strike that balance is: "decentralized --
Whether the tool facilitates code-sharing among developers who are
independent and who are cooperating only loosely or even not at all.".

Would you say that Aegis facilitates code-sharing among such developers?

As far as I understand, it has centralized hosting, such that all
developers who want to use it have to rely on the owner of the
central repository.

Regards,

Zooko
Walter Franzini
2008-02-27 16:29:12 UTC
Permalink
zooko <zooko at zooko.com> writes:

[...]
Post by zooko
To maintain a page like that one of course requires a fine balance
between being long-winded and being wrong.
I think the "decentralised: no" value *is* wrong :-)
Post by zooko
Currently my attempt to strike that balance is: "decentralized --
Whether the tool facilitates code-sharing among developers who are
independent and who are cooperating only loosely or even not at all.".
Would you say that Aegis facilitates code-sharing among such developers?
Yes. I'm going to explain why, and since I don't have direct
experience with other tools that have the "decentralised: yes" field,
I'll let you judge if it's enough to have the flag toggled.

Obviously Aegis development is under Aegis.
Currently there are two public repositories for Aegis the one owned by
Peter Miller hosted on http://aegis.sourceforge.net/ and the one I use
to publish my contribution (hosted at http://aegis.stepbuild.org/).

I track Peter Miller's changes using the aedist -{missing,replay,pending}
commands, and I think Peter tracks my repository in the same way.

The -missing subcommand lists the changes on the remote repository
that are not in my repository.

The -pending subcommand lists the changes on my repository not yet in
Peter's one.

The -replay subcommand fetches the missing changesets and applies them
to my repository. By "apply" I mean apply in the Aegis way:

0) download the aedist archive
1) create and develop_begin a new change
2) delete/copy/create any needed file
3) apply the patch in the archive *or* use the full source of the file
4) run the merge tool as needed
5) develop_end the change
6) if configured to do so integrate the change into the baseline
7) if there are more missing changesets goto 0.
Post by zooko
As far as I understand, it has centralized hosting, such that all
developers who want to use it have to rely on the owner of the
central repository.
I don't understand but I think the text above should show that
decentralized hosting is possible.

--
Walter Franzini
http://aegis.stepbuild.org/

PGP Public key ID: 1024D/CB3FEB43
Key fingerprint : FA26 C33B CAFF 7848 EFEB 7327 96AA 2D57 CB3F EB43
Key server : http://www.keyserver.net
zooko
2008-02-28 12:56:53 UTC
Permalink
Post by Walter Franzini
Post by zooko
Currently my attempt to strike that balance is: "decentralized --
Whether the tool facilitates code-sharing among developers who are
independent and who are cooperating only loosely or even not at all.".
Would you say that Aegis facilitates code-sharing among such
developers?
Yes. I'm going to explain why, and since I don't have direct
experience with other tools that have the "decentralised: yes" field,
I'll let you judge if it's enough to have the flag toggled.
...
Post by Walter Franzini
0) download the aedist archive
1) create and develop_begin a new change
2) delete/copy/create any needed file
3) apply the patch in the archive *or* use the full source of the file
4) run the merge tool as needed
5) develop_end the change
6) if configured to do so integrate the change into the baseline
7) if there are more missing changesets goto 0.
Okay, I'm convinced. Toggling the "decentralized" flag on aegis.

How long has this been the case??

Regards,

Zooko
Walter Franzini
2008-02-28 16:18:07 UTC
Permalink