Recent comments posted to this site:
I'd like to reiterate a question that was unanswered above:
Is there a way to tell the S3 backend to store files under their local names instead of under hashed key names? I.e., if I annex foo/bar.txt, annex would put it in S3 as mybucket.name/foo/bar.txt instead of mybucket.name/GPGHMACSHA1-random.txt.
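A sketch of one way this can work, assuming a git-annex recent enough to support exporttree special remotes ("mys3" and the bucket name are placeholders):

    # create an S3 special remote that stores an exported tree under filenames
    git annex initremote mys3 type=S3 bucket=mybucket.name exporttree=yes encryption=none
    # export the current branch, so foo/bar.txt lands at mybucket.name/foo/bar.txt
    git annex export master --to mys3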
Wow, scary
Dilyin's comment is scary. It suggests that bad things can happen, but it is not very clear about what.
Bloated history is one thing.
An obviously broken repo is bad, but it can be (slowly) recovered from remotes.
A subtly crippled history that you don't notice can be a major problem, especially once you have propagated it to all your remotes in order to "recover from bloat".
More common than it seems
There's a case probably more common than people actually report: mistakenly doing git add instead of git annex add, and realizing it only after a number of commits. Doing git annex add at that point leaves the file stored twice (once in regular git history, once in the annex).
Extra wish: when doing git annex add of a file that is already present in git history, git-annex could notice and warn.
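A sketch of recovering from the mistaken git add (bigfile.bin is a placeholder path):

    # take the file out of the index, keeping the working copy
    git rm --cached bigfile.bin
    # add it to the annex instead, and commit the replacement symlink
    git annex add bigfile.bin
    git commit -m "move bigfile.bin into the annex"
    # the old blob is still in earlier commits; only rewriting history
    # (e.g. with git filter-branch) would actually reclaim that space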
Simple solution?
Can anyone elaborate on the scripts provided here? Are they safe? What can happen if they are used improperly, or in corner cases?
- "files are replaced with symlinks and are in the index" -> so what ?
- "Make sure that you don't have annex.largefiles settings that would prevent annexing the files." -> What would happen? Also
.gitattributes
.
Thank you.
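For what it's worth, both places where annex.largefiles can be set are easy to check before running such a script; a sketch, with somefile as a placeholder path:

    # repository-wide setting, if any
    git config annex.largefiles
    # per-path setting from .gitattributes, if any
    git check-attr annex.largefiles -- somefile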
@davidriod if you're using an rsyncurl that goes over ssh, then yes, the transmission goes over a secure connection. If the rsyncurl uses the rsync protocol, there is no encryption.
Of course, encryption=none does not keep the data encrypted at rest, so the admin of the rsync server can see it, etc.
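A sketch of the two variants (remote name, host, and path are placeholders):

    # transport over ssh; with encryption=none the content is plain at rest
    git annex initremote myrsync type=rsync rsyncurl=user@server:/srv/annex encryption=none
    # alternatively, encrypt the content at rest with a key shared via the repo
    git annex initremote myrsync type=rsync rsyncurl=user@server:/srv/annex encryption=shared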
@bec.watson, it would be better to open a bug report for this kind of problem.
It seems that the AWS library git-annex uses does not support V4 authorization yet. Work is in progress: https://github.com/aristidb/aws/pull/199
As to the endpoint hostname, that must be a special case for China, so I've made git-annex aware of that special case.
@spiderbit, to support recursively adding content, it would need to parse html, and that's simply too complex and too unlikely to be useful to many people.
Sorry, I should have read the man page: of course I have to use %url and %file.
So it works with "rsync %url %file", but it doesn't seem to work recursively, and it renames the files instead of adding them under their normal names. So it's not useful for what I want to do.
I want to access a normal, unmodified directory on my server and add its files to my local directory. That would be a minimal setup; everything else means a really big setup, with the assistant running, a cronjob to delete unused files, and lots of CPU load for indexing those files on the server.
I think such a minimal setup would be great for getting started without anything complex; you don't want to commit to such a tool and spend hours of setup to get something useful, just to see whether it's useful for you.
There are two approaches: either have a normal repo on the server, again with a cronjob and a flat setup, which is quite a lot to set up; or have a real repo only on the client. Both have big disadvantages: a normal repo means a complex setup, too complex to just test; web links seem simple enough, but they are not recursive and are therefore only good for YouTube links and the like.
Is there really no simple solution for what I want to do?
I guess adding an hourly cronjob that drops all unused files would be acceptable, maybe?
Or is there a better solution?
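For reference, a sketch of a minimal setup along the lines described above, using the rsync download command already mentioned (server paths and file names are placeholders, and this does not address the recursive case):

    # have git-annex fetch urls with rsync; %url and %file are expanded by git-annex
    git config annex.web-download-command "rsync %url %file"
    # add one file from the server, keeping its normal name via --file
    git annex addurl --file music/song.ogg rsync://server/share/music/song.ogg
    # an hourly cronjob could then drop content that is no longer used:
    #   git annex unused && git annex dropunused all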