Technology Committee/Project requests/FTP site to support projects


Overview

Media-related projects can need a lot of temporary space for storing video and images that are processed in cooperation with other members of a project team. A shared FTP site would be particularly handy for quickly uploading files at events. Those files could later be uploaded to Commons after file processing (especially for video and audio, which have to be converted to open standards), after cooperative discussion on how best to add good quality metadata before release, or after confirmation or review of any potential copyright issues.

Recent examples include:

  • Sharing WLM files - photos from Wiki Takes Chester were extracted from Commons and the result, over 2GB, was supplied to our event hosts. This involved a 3-hour upload via a slow free website.
  • XenoCanto audio files - this was a Commons project that required reprocessing MP3 files to the Ogg format before upload to Commons; just under 1GB was shared in that cooperation. Sharing was achieved via Dropbox, but it was complex to negotiate.
  • Noaabot - this was an upload of 10 years' worth of weather maps to Commons, totalling 22,000 image files. Converting the archived GIFs to PNG initially relied on creating a page on the Toolserver, a facility which is being phased out, and was only useful for one-way sharing via web-page access. This was another over-complex solution compared to an open FTP site.

In terms of security, it would make sense for the facility to be password-protected, with the password changed periodically. To keep this a temporary space rather than a library, files over 90 days old could be deleted automatically.
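The automatic deletion suggested above could be handled by a small scheduled script. The following is a minimal sketch only, assuming the shared files sit in an ordinary directory on the host and that a file's age is judged by its last-modified time. The function names and the `dry_run` flag are hypothetical, not part of any agreed setup.

```python
import os
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def find_expired(root, max_age_days=90, now=None):
    """Return the files under root last modified more than max_age_days ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.stat().st_mtime < cutoff
    )


def delete_expired(root, max_age_days=90, now=None, dry_run=True):
    """Delete expired files and return them; with dry_run=True only list them."""
    expired = find_expired(root, max_age_days, now)
    if not dry_run:
        for path in expired:
            path.unlink()
    return expired
```

Running it with `dry_run=True` first gives a list of what would be removed, which could be checked before any real deletion is scheduled. The retention period is a parameter, so a different limit can be used without changing the code.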

Budget

Potentially none, or a low maintenance cost, if secure and reliable FTP hosting could be arranged on current office equipment or if another chapter has a facility it could share. If I were to procure this, I would look for a cheap solution providing 250GB or more for a budget of less than £10/month, and I would have a chat with WMCH to check for inter-chapter options.

Timeline

Indefinite. Success would be measured by reports back from projects that needed file-sharing space and benefited from this facility.

Expected outcomes

A range of projects where temporary file-sharing does not have to depend on closed systems such as Dropbox or Google Drive, where volunteers waste significant time juggling options when they reach file size limits and end up bending the rules on how these free services are supposed to be used.

This might be made available and promoted for relevant use for all members known to be active on projects, and therefore be seen as a benefit of membership.

Who I am

I'm , one of the top ten unpaid content contributors to Wikimedia Commons and with many successful projects.

Discussion

(Historical note: this was originally submitted as a microgrant, but it's now been migrated to a tech committee proposal. Mike Peel (talk) 21:55, 23 October 2013 (UTC))

Just to note: the microgrant guidelines have "Website hosting (although webspace can be provided if needed by other means)" under "Things that won't be considered", which was included specifically to avoid requests like this, as they'd be better suited to being asked of the tech committee. Thanks. Mike Peel (talk) 16:13, 22 October 2013 (UTC)
Please close this request in that case. If someone on the Tech Committee wants to pick this up, then it might move forward. If WMUK is not interested, then I'd like to know that definitively, as I can pursue this for global Commons projects as a non-Chapter proposal. Thanks -- (talk) 18:26, 22 October 2013 (UTC)
Hi Fae. We discussed this at Technology Committee last night and broadly there was agreement that this was something we'd be interested in supporting. In terms of a timeline for setting something up, it will depend on costs because a higher spending commitment would require board approval, but the consensus last night was it was unlikely to reach that spending threshold so I could approve and proceed once we've made sure everyone is happy with what is proposed. The minutes for the Tech Comm will be on wiki by the end of today FYI.
Are you happy for me to move this page to something within the Tech Comm part of the wiki, and then committee members can ask questions and we can hopefully take a view on recommending we set something up? Katherine Bavage (WMUK) (talk) 10:40, 23 October 2013 (UTC)
By all means. I'm reading this microgrant request as rejected, so there will be no progress here. -- (talk) 10:43, 23 October 2013 (UTC)
Yes, but also that it's a project the Tech Comm would like to support, so there will (hopefully) be progress there instead? Katherine Bavage (WMUK) (talk) 10:57, 23 October 2013 (UTC)
I support this proposal. It would have been useful for WLM, and I'd expect it will be useful for anything similar we might want to do next year as well. I don't know whether we can host it ourselves at lower cost than hiving it out to a separate hosting company? --MichaelMaggs (talk) 11:57, 23 October 2013 (UTC)
This proposal is at least partly a result of some requests that I made to the UK community, where Fae was the volunteer who stepped forward to action them. As we work more with museums and archives to get releases of information, we can anticipate more scenarios where we receive batches of data that we can put here for a bot operator such as Fae to upload to Commons. We should also expect more transfers in the opposite direction, where we want to respond to an image donation by supplying the donor with an extract from Commons of images that meet their collection criteria. Fae makes the point that the alternative free site that he used was excessively slow; I would also add that the free option only hosts the file for 7 days. In short, this is a modest request from a volunteer for tools that would assist the sort of work they do for us, and as such we should do this unless we have a better option. Jonathan Cardy (WMUK) (talk) 16:55, 23 October 2013 (UTC)

This request was made by for a microgrant but was rejected, as "Website hosting" is not eligible for microgrant support. It was determined that it would be appropriate to re-propose it as a Technology project request. Please see full request here.

Do we know roughly (?) how much use this would see? It would be helpful to estimate costs - I assume we'll have to think about funding it for a year and review so if we could rough this out that would help.
What kind of gatekeeping (if any?) is in place? How would people access it? (Sorry if that's a silly question.)
How can we protect the charity's reputation and security against inappropriate use, e.g. inappropriate images, illegal file sharing etc.? :) Katherine Bavage (WMUK) (talk) 12:15, 23 October 2013 (UTC)
Btw anyone is welcome to come in on the above with ideas - I don't expect Fæ to necessarily have the answers to all. Katherine Bavage (WMUK) (talk) 12:16, 23 October 2013 (UTC)

Hosting options

Here's a quick look into some of the hosting options available for this. Please expand/comment! Thanks. Mike Peel (talk) 22:23, 23 October 2013 (UTC)

Host | Cost of space | Cost of bandwidth | Limitations | Minimum cost
Rackspace cloud storage | 12p/GB | 0 | 100-1000GB | £12/month (100GB)
Amazon S3 | Complex, but around $0.095/GB, or ~6p/GB | Complex, but around $0.12/GB, or 7.5p/GB | Extra costs for requests | Unclear
Dropbox for Business | -- | -- | Standard is unlimited storage for 5 users; extra users are $125/ea | $795/yr for 5 users
WMCH | TBA, possibly zero | N/A | None | None

Dropbox looks pricey. I'm assuming that we would be using this for an assortment of partners who might only have one or two transactions each so charging by user would be very expensive.
There is also the issue of where the information would be. Assuming this is all for files for uploading to Wikimedia sites or extractions from Wikimedia sites then the Data Protection concerns are near nonexistent, so for this arrangement I would say we should take the cheaper of Amazon or Rackspace. Jonathan Cardy (WMUK) (talk) 14:19, 29 October 2013 (UTC)
As WMCH openly runs BBB for sister chapter use, and therefore has significant storage for ad-hoc video capture, tacking on some flexible FTP space also for inter-chapter support may be an interesting cooperation rather than going for an independent UK deal. -- (talk) 14:33, 29 October 2013 (UTC)
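To compare the two per-GB options on a like-for-like basis, a rough monthly estimate can be worked out from the figures in the table. This is a sketch under stated assumptions: the per-GB rates are treated as monthly charges, S3 request charges are ignored, and the usage figures (250GB stored, as mentioned in the Budget section, and a hypothetical 50GB transferred per month) are illustrative rather than measured.

```python
def monthly_cost_gbp(stored_gb, transferred_gb, space_rate, bandwidth_rate, minimum=0.0):
    """Estimate a monthly bill in pounds from per-GB storage and bandwidth rates."""
    cost = stored_gb * space_rate + transferred_gb * bandwidth_rate
    # Some hosts have a minimum monthly charge regardless of usage.
    return max(cost, minimum)


# Rates in pounds per GB, taken from the table; treating them as monthly is an assumption.
RATES = {
    "Rackspace": {"space_rate": 0.12, "bandwidth_rate": 0.0, "minimum": 12.0},
    "Amazon S3": {"space_rate": 0.06, "bandwidth_rate": 0.075, "minimum": 0.0},
}

# Hypothetical usage: 250GB stored and 50GB transferred in a month.
for host, rates in RATES.items():
    print(f"{host}: £{monthly_cost_gbp(250, 50, **rates):.2f}/month")
```

With these assumed figures the estimate comes to about £30 a month for Rackspace and £18.75 for Amazon S3, before any S3 request charges. The gap narrows as the volume transferred each month grows, since Rackspace lists no bandwidth charge.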

How to proceed

The Tech Committee has decided on a six-month trial. Rackspace was the preferred option, as the WMCH option is only theoretical at this stage.

Some terms of use need to be set out, namely how it should be used. If anyone has opinions on this, this section would be a good place to discuss it. Richard Nevell (WMUK) (talk) 16:51, 23 January 2014 (UTC)

Any comments? Richard Nevell (WMUK) (talk) 11:24, 31 January 2014 (UTC)
Ping. Richard Nevell (WMUK) (talk) 15:57, 3 February 2014 (UTC)

Terms of Use

  1. This site is only for temporary storage whilst files are being moved around, and all files will be automatically deleted after 60 days.
  2. Users may only use this space for files extracted from Wikimedia projects or which are going to be batch uploaded to a Wikimedia project.
  3. Anything identified as out of scope may be deleted.